DECLARATION OF AVI ASHKENAZI, Ph.D. UNDER 37 C.F.R. § 1.132



I, Avi Ashkenazi, Ph.D., declare and say as follows:

1. I am Director and Staff Scientist at the Molecular Oncology Department of 
Genentech, Inc., South San Francisco, CA 94080. 

2. I joined Genentech in 1988 as a postdoctoral fellow. Since then, I have 
investigated a variety of cellular signal transduction mechanisms, including apoptosis, and have 
developed technologies to modulate such mechanisms as a means of therapeutic intervention in 
cancer and autoimmune disease. I am currently involved in the investigation of a series of 
secreted proteins over-expressed in tumors, with the aim to identify useful targets for the 
development of therapeutic antibodies for cancer treatment. 

3. My scientific Curriculum Vitae, including my list of publications, is attached to
and forms part of this Declaration (Exhibit A).

4. Gene amplification is a process in which chromosomes undergo changes to
contain multiple copies of certain genes that normally exist as a single copy, and is an important
factor in the pathophysiology of cancer. Amplification of certain genes (e.g., Myc or Her2/Neu)
gives cancer cells a growth or survival advantage relative to normal cells, and might also provide
a mechanism of tumor cell resistance to chemotherapy or radiotherapy.

5. If gene amplification results in over-expression of the mRNA and the
corresponding gene product, then it identifies that gene product as a promising target for cancer
therapy, for example by the therapeutic antibody approach. Even in the absence of
over-expression of the gene product, amplification of a cancer marker gene - as detected, for example,
by the reverse transcriptase TaqMan® PCR or the fluorescence in situ hybridization (FISH)
assays - is useful in the diagnosis or classification of cancer, or in predicting or monitoring the
efficacy of cancer therapy. An increase in gene copy number can result not only from
intrachromosomal changes but also from chromosomal aneuploidy. It is important to understand
that detection of gene amplification can be used for cancer diagnosis even if the determination
includes measurement of chromosomal aneuploidy. Indeed, as long as a significant difference
relative to normal tissue is detected, it is irrelevant if the signal originates from an increase in the
number of gene copies per chromosome and/or an abnormal number of chromosomes.

6. I understand that according to the Patent Office, absent data demonstrating that
the increased copy number of a gene in certain types of cancer leads to increased expression of
its product, gene amplification data are insufficient to provide substantial utility or well
established utility for the gene product (the encoded polypeptide), or an antibody specifically
binding the encoded polypeptide. However, even when amplification of a cancer marker gene
does not result in significant over-expression of the corresponding gene product, this very
absence of gene product over-expression still provides significant information for cancer
diagnosis and treatment. Thus, if over-expression of the gene product does not parallel gene
amplification in certain tumor types but does so in others, then parallel monitoring of gene
amplification and gene product over-expression enables more accurate tumor classification and
hence better determination of suitable therapy. In addition, absence of over-expression is crucial
information for the practicing clinician. If a gene is amplified but the corresponding gene
product is not over-expressed, the clinician accordingly will decide not to treat a patient with
agents that target that gene product.

7. I hereby declare that all statements made herein of my own knowledge are true
and that all statements made on information or belief are believed to be true, and further that
these statements were made with the knowledge that willful false statements and the like so
made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the
United States Code and that such willful statements may jeopardize the validity of the
application or any patent issued thereon.

By: [signature]    Date: [illegible]
Avi Ashkenazi, Ph.D.
CURRICULUM VITAE
Avi Ashkenazi 
Personal:

Date of birth: 29 November, 1956
Address: 1456 Tarrytown Street, San Mateo, CA 94402
Phone: (650) 578-9199 (home); (650) 225-1853 (office)
Fax: (650) 225-6443 (office)
Email: aa@gene.com



Education:

1983: B.S. in Biochemistry, with honors, Hebrew University, Israel
1986: Ph.D. in Biochemistry, Hebrew University, Israel



Employment:

1983-1986: Teaching assistant, undergraduate level course in Biochemistry
1985-1986: Teaching assistant, graduate level course on Signal Transduction
1986-1988: Postdoctoral fellow, Hormone Research Dept., UCSF, and
Developmental Biology Dept., Genentech, Inc., with J. Ramachandran
1988-1989: Postdoctoral fellow, Molecular Biology Dept., Genentech, Inc.,
with D. Capon
1989-1993: Scientist, Molecular Biology Dept., Genentech, Inc.
1994-1996: Senior Scientist, Molecular Oncology Dept., Genentech, Inc.
1996-1997: Senior Scientist and Interim director, Molecular Oncology Dept.,
Genentech, Inc.
1997-1999: Senior Scientist and preclinical project team leader, Genentech, Inc.
1999-2002: Staff Scientist in Molecular Oncology, Genentech, Inc.
2002-present: Staff Scientist and Director in Molecular Oncology, Genentech, Inc.



Awards:

1988: First prize, The Boehringer Ingelheim Award
Editorial:

Editorial Board Member: Current Biology 
Associate Editor, Clinical Cancer Research.
Associate Editor, Cancer Biology and Therapy. 

Refereed papers: 

1. Gertler, A., Ashkenazi, A., and Madar, Z. Binding sites for human growth
hormone and ovine and bovine prolactins in the mammary gland and liver of the
lactating cow. Mol Cell Endocrinol 34, 51-57 (1984).

2. Gertler, A., Shamay, A., Cohen, N., Ashkenazi, A., Friesen, H., Levanon, A., 
Gorecki, M., Aviv, H., Hadari, D., and Vogel, T. Inhibition of lactogenic 
activities of ovine prolactin and human growth hormone (hGH) by a novel form of 
a modified recombinant hGH. Endocrinology 118, 720-726 (1986). 

3. Ashkenazi, A., Madar, Z., and Gertler, A. Partial purification and characterization
of bovine mammary gland prolactin receptor. Mol Cell Endocrinol 50, 79-87
(1987).

4. Ashkenazi, A., Pines, M., and Gertler, A. Down-regulation of lactogenic
hormone receptors in Nb2 lymphoma cells by cholera toxin. Biochemistry
Internat. 14, 1065-1072 (1987).

5. Ashkenazi, A., Cohen, R., and Gertler, A. Characterization of lactogen receptors
in lactogenic hormone-dependent and independent Nb2 lymphoma cell lines.
FEBS Lett 210, 51-55 (1987).

6. Ashkenazi, A., Vogel, T., Barash, I., Hadari, D., Levanon, A., Gorecki, M., and
Gertler, A. Comparative study on in vitro and in vivo modulation of lactogenic
and somatotropic receptors by native human growth hormone and its modified
recombinant analog. Endocrinology 121, 414-419 (1987).

7. Peralta, E., Winslow, J., Peterson, G., Smith, D., Ashkenazi, A., Ramachandran, 
J., Schimerlik, M., and Capon, D. Primary structure and biochemical properties 
of an M2 muscarinic receptor. Science 236, 600-605 (1987). 

8. Peralta, E., Ashkenazi, A., Winslow, J., Smith, D., Ramachandran, J., and Capon,
D. J. Distinct primary structures, ligand-binding properties and tissue-specific
expression of four human muscarinic acetylcholine receptors. EMBO J. 6,
3923-3929 (1987).

9. Ashkenazi, A., Winslow, J., Peralta, E., Peterson, G., Schimerlik, M., Capon, D.,
and Ramachandran, J. An M2 muscarinic receptor subtype coupled to both
adenylyl cyclase and phosphoinositide turnover. Science 238, 672-675 (1987).
10. Pines, M., Ashkenazi, A., Cohen-Chapnik, N., Binder, L., and Gertler, A.
Inhibition of the proliferation of Nb2 lymphoma cells by femtomolar
concentrations of cholera toxin and partial reversal of the effect by
12-O-tetradecanoyl-phorbol-13-acetate. J. Cell Biochem. 37, 119-129 (1988).

11. Peralta, E., Ashkenazi, A., Winslow, J., Ramachandran, J., and Capon, D.
Differential regulation of PI hydrolysis and adenylyl cyclase by muscarinic
receptor subtypes. Nature 334, 434-437 (1988).

12. Ashkenazi, A., Peralta, E., Winslow, J., Ramachandran, J., and Capon, D.
Functionally distinct G proteins couple different receptors to PI hydrolysis in the
same cell. Cell 56, 487-493 (1989).

13. Ashkenazi, A., Ramachandran, J., and Capon, D. Acetylcholine analogue
stimulates DNA synthesis in brain-derived cells via specific muscarinic
acetylcholine receptor subtypes. Nature 340, 146-150 (1989).

14. Lamarre, D., Ashkenazi, A., Fleury, S., Smith, D., Sekaly, R., and Capon, D.
The MHC-binding and gp120-binding domains of CD4 are distinct and separable.
Science 245, 743-745 (1989).

15. Ashkenazi, A., Presta, L., Marsters, S., Camerato, T., Rosenthal, K., Fendly, B.,
and Capon, D. Mapping the CD4 binding site for human immunodeficiency
virus type 1 by alanine-scanning mutagenesis. Proc. Natl Acad. Sci. USA. 87,
7150-7154 (1990).

16. Chamow, S., Peers, D., Byrn, R., Mulkerrin, M., Harris, R., Wang, W., Bjorkman,
P., Capon, D., and Ashkenazi, A. Enzymatic cleavage of a CD4 immunoadhesin
generates crystallizable, biologically active Fd-like fragments. Biochemistry 29,
9885-9891 (1990).

17. Ashkenazi, A., Smith, D., Marsters, S., Riddle, L., Gregory, T., Ho, D., and
Capon, D. Resistance of primary isolates of human immunodeficiency virus type
1 to soluble CD4 is independent of CD4-rgp120 binding affinity. Proc. Natl
Acad. Sci. USA. 88, 7056-7060 (1991).

18. Ashkenazi, A., Marsters, S., Capon, D., Chamow, S., Figari, I., Pennica, D.,
Goeddel, D., Palladino, M., and Smith, D. Protection against endotoxic shock by
a tumor necrosis factor receptor immunoadhesin. Proc. Natl Acad. Sci. USA. 88,
10535-10539 (1991).

19. Moore, J., McKeating, J., Huang, Y., Ashkenazi, A., and Ho, D. Virions of
primary HIV-1 isolates resistant to sCD4 neutralization differ in sCD4 affinity and
glycoprotein gp120 retention from sCD4-sensitive isolates. J. Virol 66, 235-243
(1992).






20. Jin, H., Oksenberg, D., Ashkenazi, A., Peroutka, S., Duncan, A., Rozmahel, R.,
Yang, Y., Mengod, G., Palacios, J., and O'Dowd, B. Characterization of the
human 5-hydroxytryptamine1B receptor. J. Biol Chem. 267, 5735-5738 (1992).

21. Marsters, A., Frutkin, A., Simpson, N., Fendly, B. and Ashkenazi, A.
Identification of cysteine-rich domains of the type 1 tumor necrosis receptor
involved in ligand binding. J. Biol Chem. 267, 5747-5750 (1992).

22. Chamow, S., Kogan, T., Peers, D., Hastings, R., Byrn, R., and Ashkenazi, A.
Conjugation of sCD4 without loss of biological activity via a novel
carbohydrate-directed cross-linking reagent. J. Biol Chem. 267, 15916-15922 (1992).

23. Oksenberg, D., Marsters, A., O'Dowd, B., Jin, H., Havlik, S., Peroutka, S., and
Ashkenazi, A. A single amino-acid difference confers major pharmacologic
variation between human and rodent 5-HT1B receptors. Nature 360, 161-163
(1992).

24. Haak-Frendscho, M., Marsters, S., Chamow, S., Peers, D., Simpson, N., and
Ashkenazi, A. Inhibition of interferon γ by an interferon γ receptor
immunoadhesin. Immunology 79, 594-599 (1993).

25. Pennica, D., Lam, V., Weber, R., Kohr, W., Basa, L., Spellman, M., Ashkenazi, A.,
Shire, S., and Goeddel, D. Biochemical characterization of the extracellular
domain of the 75-kd tumor necrosis factor receptor. Biochemistry 32, 3131-3138
(1993).

26. Barfod, L., Zheng, Y., Kuang, W., Hart, M., Evans, T., Cerione, R., and
Ashkenazi, A. Cloning and expression of a human CDC42 GTPase Activating
Protein reveals a functional SH3-binding domain. J. Biol Chem. 268,
26059-26062 (1993).

27. Chamow, S., Zhang, D., Tan, X., Mhatre, S., Marsters, S., Peers, D., Byrn, R.,
Ashkenazi, A., and Junghans, R. A humanized bispecific
immunoadhesin-antibody that retargets CD3+ effectors to kill HIV-1-infected cells. J. Immunol
153, 4268-4280 (1994).

28. Means, R., Krantz, S., Luna, J., Marsters, S., and Ashkenazi, A. Inhibition of
murine erythroid colony formation in vitro by interferon γ and correction by
interferon γ receptor immunoadhesin. Blood 83, 911-915 (1994).

29. Haak-Frendscho, M., Marsters, S., Mordenti, J., Gillet, R., Chen, S.,
and Ashkenazi, A. Inhibition of TNF by a TNF receptor immunoadhesin:
comparison with an anti-TNF mAb. J. Immunol 152, 1347-1353 (1994).
30. Chamow, S., Kogan, T., Venuti, M., Gadek, T., Peers, D., Mordenti, J., Shak, S.,
and Ashkenazi, A. Modification of CD4 immunoadhesin with
monomethoxy-PEG aldehyde via reductive alkylation. Bioconj. Chem. 5, 133-140 (1994).

31. Jin, H., Yang, R., Marsters, S., Bunting, S., Wurm, F., Chamow, S., and
Ashkenazi, A. Protection against rat endotoxic shock by p55 tumor necrosis factor
(TNF) receptor immunoadhesin: comparison to anti-TNF monoclonal antibody. J.
Infect Diseases 170, 1323-1326 (1994).

32. Beck, J., Marsters, S., Harris, R., Ashkenazi, A., and Chamow, S. Generation of
soluble interleukin-1 receptor from an immunoadhesin by specific cleavage. Mol
Immunol 31, 1335-1344 (1994).

33. Pitti, B., Marsters, M., Haak-Frendscho, M., Osaka, G., Mordenti, J., Chamow, S.,
and Ashkenazi, A. Molecular and biological properties of an interleukin-1
receptor immunoadhesin. Mol Immunol 31, 1345-1351 (1994).

34. Oksenberg, D., Havlik, S., Peroutka, S., and Ashkenazi, A. The third intracellular
loop of the 5-HT2 receptor specifies effector coupling. J. Neurochem. 64,
1440-1447 (1995).

35. Bach, E., Szabo, S., Dighe, A., Ashkenazi, A., Aguet, M., Murphy, K., and
Schreiber, R. Ligand-induced autoregulation of IFN-γ receptor β chain expression
in T helper cell subsets. Science 270, 1215-1218 (1995).

36. Jin, H., Yang, R., Marsters, S., Ashkenazi, A., Bunting, S., Marra, M., Scott, R.,
and Baker, J. Protection against endotoxic shock by
bactericidal/permeability-increasing protein in rats. J. Clin. Invest. 95, 1947-1952 (1995).

37. Marsters, S., Pennica, D., Bach, E., Schreiber, R., and Ashkenazi, A. Interferon γ
signals via a high-affinity multisubunit receptor complex that contains two types
of polypeptide chain. Proc. Natl Acad. Sci. USA. 92, 5401-5405 (1995).

38. Van Zee, K., Moldawer, L., Oldenburg, H., Thompson, W., Stackpole, S.,
Montegut, W., Rogy, M., Meschter, C., Gallati, H., Schiller, C., Richter, W.,
Loetscher, H., Ashkenazi, A., Chamow, S., Wurm, F., Calvano, S., Lowry, S., and
Lesslauer, W. Protection against lethal E. coli bacteremia in baboons by
pretreatment with a 55-kDa TNF receptor-Ig fusion protein, Ro45-2081. J.
Immunol 156, 2221-2230 (1996).

39. Pitti, R., Marsters, S., Ruppert, S., Donahue, C., Moore, A., and Ashkenazi, A.
Induction of apoptosis by Apo-2 Ligand, a new member of the tumor necrosis
factor cytokine family. J. Biol Chem. 271, 12687-12690 (1996).
40. Marsters, S., Pitti, R., Donahue, C., Rupert, S., Bauer, K., and Ashkenazi, A.
Activation of apoptosis by Apo-2 ligand is independent of FADD but blocked by
CrmA. Curr. Biol. 6, 1669-1676 (1996).

41. Marsters, S., Skubatch, M., Gray, C., and Ashkenazi, A. Herpesvirus entry
mediator, a novel member of the tumor necrosis factor receptor family, activates
the NF-kB and AP-1 transcription factors. J. Biol Chem. 272, 14029-14032
(1997).

42. Sheridan, J., Marsters, S., Pitti, R., Gurney, A., Skubatch, M., Baldwin, D.,
Ramakrishnan, L., Gray, C., Baker, K., Wood, W.I., Goddard, A., Godowski, P., and
Ashkenazi, A. Control of TRAIL-induced apoptosis by a family of signaling and
decoy receptors. Science 277, 818-821 (1997).

43. Marsters, S., Sheridan, J., Pitti, R., Gurney, A., Skubatch, M., Baldwin, D., Huang, A.,
Yuan, J., Goddard, A., Godowski, P., and Ashkenazi, A. A novel receptor for
Apo2L/TRAIL contains a truncated death domain. Curr. Biol. 7, 1003-1006 (1997).

44. Marsters, A., Sheridan, J., Pitti, R., Brush, J., Goddard, A., and Ashkenazi, A.
Identification of a ligand for the death-domain-containing receptor Apo3. Curr. Biol.
8, 525-528 (1998).

45. Rieger, J., Naumann, U., Glaser, T., Ashkenazi, A., and Weller, M. Apo2 ligand:
a novel weapon against malignant glioma? FEBS Lett 427, 124-128 (1998).

46. Pender, S., Fell, J., Chamow, S., Ashkenazi, A., and MacDonald, T. A p55 TNF
receptor immunoadhesin prevents T cell mediated intestinal injury by inhibiting
matrix metalloproteinase production. J. Immunol. 160, 4098-4103 (1998).

47. Pitti, R., Marsters, S., Lawrence, D., Roy, M., Kischkel, F., Dowd, P., Huang, A.,
Donahue, C., Sherwood, S., Baldwin, D., Godowski, P., Wood, W., Gurney, A.,
Hillan, K., Cohen, R., Goddard, A., Botstein, D., and Ashkenazi, A. Genomic
amplification of a decoy receptor for Fas ligand in lung and colon cancer. Nature
396, 699-703 (1998).

48. Mori, S., Marakami-Mori, K., Nakamura, S., Ashkenazi, A., and Bonavida, B.
Sensitization of AIDS Kaposi's sarcoma cells to Apo-2 ligand-induced apoptosis
by actinomycin D. J. Immunol. 162, 5616-5623 (1999).

49. Gurney, A., Marsters, S., Huang, A., Pitti, R., Mark, M., Baldwin, D., Gray, A.,
Dowd, P., Brush, J., Heldens, S., Schow, P., Goddard, A., Wood, W., Baker, K.,
Godowski, P., and Ashkenazi, A. Identification of a new member of the tumor
necrosis factor family and its receptor, a human ortholog of mouse GITR. Curr.
Biol. 9, 215-218 (1999).
50. Ashkenazi, A., Pai, R., Fong, S., Leung, S., Lawrence, D., Marsters, S., Blackie,
C., Chang, L., McMurtrey, A., Hebert, A., DeForge, L., Khoumenis, I., Lewis, D.,
Harris, L., Bussiere, J., Koeppen, H., Shahrokh, Z., and Schwall, R. Safety and
anti-tumor activity of recombinant soluble Apo2 ligand. J. Clin. Invest. 104,
155-162 (1999).

51. Chuntharapai, A., Gibbs, V., Lu, J., Ow, A., Marsters, S., Ashkenazi, A., De Vos,
A., and Kim, K.J. Determination of residues involved in ligand binding and signal
transmission in the human IFN-α receptor 2. J. Immunol. 163, 766-773 (1999).

52. Johnsen, A.-C., Haux, J., Steinkjer, B., Nonstad, U., Egeberg, K., Sundan, A.,
Ashkenazi, A., and Espevik, T. Regulation of Apo2L/TRAIL expression in NK
cells - involvement in NK cell-mediated cytotoxicity. Cytokine 11, 664-672
(1999).

53. Roth, W., Isenmann, S., Naumann, U., Kugler, S., Bahr, M., Dichgans, J.,
Ashkenazi, A., and Weller, M. Eradication of intracranial human malignant
glioma xenografts by Apo2L/TRAIL. Biochem. Biophys. Res. Commun. 265,
479-483 (1999).

54. Hymowitz, S.G., Christinger, H.W., Fuh, G., Ultsch, M., O'Connell, M., Kelley,
R.F., Ashkenazi, A., and de Vos, A.M. Triggering Cell Death: The Crystal
Structure of Apo2L/TRAIL in a Complex with Death Receptor 5. Molec. Cell 4,
563-571 (1999).

55. Hymowitz, S.G., O'Connell, M.P., Ultsch, M.H., Hurst, A., Totpal, K., Ashkenazi,
A., de Vos, A.M., and Kelley, R.F. A unique zinc-binding site revealed by a
high-resolution X-ray structure of homotrimeric Apo2L/TRAIL. Biochemistry 39,
633-640 (2000).

56. Zhou, Q., Fukushima, P., DeGraff, W., Mitchell, J.B., Stetler-Stevenson, M.,
Ashkenazi, A., and Steeg, P.S. Radiation and the Apo2L/TRAIL apoptotic
pathway preferentially inhibit the colonization of premalignant human breast
cancer cells overexpressing cyclin D1. Cancer Res. 60, 2611-2615 (2000).

57. Kischkel, F.C., Lawrence, D. A., Chuntharapai, A., Schow, P., Kim, J., and
Ashkenazi, A. Apo2L/TRAIL-dependent recruitment of endogenous FADD and
Caspase-8 to death receptors 4 and 5. Immunity 12, 611-620 (2000).

58. Yan, M., Marsters, S.A., Grewal, I.S., Wang, H., *Ashkenazi, A., and *Dixit,
V.M. Identification of a receptor for BlyS demonstrates a crucial role in humoral
immunity. Nature Immunol. 1, 37-41 (2000).
59. Marsters, S.A., Yan, M., Pitti, R.M., Haas, P.E., Dixit, V.M., and Ashkenazi, A.
Interaction of the TNF homologues BLyS and APRIL with the TNF receptor
homologues BCMA and TACI. Curr. Biol. 10, 785-788 (2000).

60. Kischkel, F.C., and Ashkenazi, A. Combining enhanced metabolic labeling with
immunoblotting to detect interactions of endogenous cellular proteins.
Biotechniques 29, 506-512 (2000).

61. Lawrence, D., Shahrokh, Z., Marsters, S., Achilles, K., Shih, D., Mounho, B.,
Hillan, K., Totpal, K., DeForge, L., Schow, P., Hooley, J., Sherwood, S., Pai, R.,
Leung, S., Khan, L., Gliniak, B., Bussiere, J., Smith, C., Strom, S., Kelley, S.,
Fox, J., Thomas, D., and Ashkenazi, A. Differential hepatocyte toxicity of
recombinant Apo2L/TRAIL versions. Nature Med. 7, 383-385 (2001).

62. Chuntharapai, A., Dodge, K., Grimmer, K., Schroeder, K., Marsters, S.A.,
Koeppen, H., Ashkenazi, A., and Kim, K.J. Isotype-dependent inhibition of
tumor growth in vivo by monoclonal antibodies to death receptor 4. J. Immunol.
166, 4891-4898 (2001).

63. Pollack, I.F., Erff, M., and Ashkenazi, A. Direct stimulation of apoptotic
signaling by soluble Apo2L/tumor necrosis factor-related apoptosis-inducing
ligand leads to selective killing of glioma cells. Clin. Cancer Res. 7, 1362-1369
(2001).

64. Wang, H., Marsters, S.A., Baker, T., Chan, B., Lee, W.P., Fu, L., Tumas, D., Yan,
M., Dixit, V.M., *Ashkenazi, A., and *Grewal, I.S. TACI-ligand interactions are
required for T cell activation and collagen-induced arthritis in mice. Nature
Immunol. 2, 632-637 (2001).

65. Kischkel, F.C., Lawrence, D. A., Tinel, A., Virmani, A., Schow, P., Gazdar, A.,
Blenis, J., Arnott, D., and Ashkenazi, A. Death receptor recruitment of
endogenous caspase-10 and apoptosis initiation in the absence of caspase-8. J.
Biol Chem. 276, 46639-46646 (2001).

66. LeBlanc, H., Lawrence, D.A., Varfolomeev, E., Totpal, K., Morlan, J., Schow, P.,
Fong, S., Schwall, R., Sinicropi, D., and Ashkenazi, A. Tumor cell resistance to
death receptor induced apoptosis through mutational inactivation of the
proapoptotic Bcl-2 homolog Bax. Nature Med. 8, 274-281 (2002).

67. Miller, K., Meng, G., Liu, J., Hurst, A., Hsei, V., Wong, W-L., Ekert, R.,
Lawrence, D., Sherwood, S., DeForge, L., Gaudreault, J., Keller, G., Sliwkowski,
M., Ashkenazi, A., and Presta, L. Design, Construction, and analyses of
multivalent antibodies. J. Immunol. 170, 4854-4861 (2003).
68. Varfolomeev, E., Kischkel, F., Martin, F., Wang, H., Lawrence, D., Olsson, C.,
Tom, L., Erickson, S., French, D., Schow, P., Grewal, I. and Ashkenazi, A.
Immune system development in APRIL knockout mice. Submitted.

Review articles: 

1. Ashkenazi, A., Peralta, E., Winslow, J., Ramachandran, J., and Capon, D. J.
Functional role of muscarinic acetylcholine receptor subtype diversity. Cold
Spring Harbor Symposium on Quantitative Biology. LIII, 263-272 (1988).

2. Ashkenazi, A., Peralta, E., Winslow, J., Ramachandran, J., and Capon, D.
Functional diversity of muscarinic receptor subtypes in cellular signal
transduction and growth. Trends Pharmacol Sci. Dec Supplement, 12-21 (1989).

3. Chamow, S., Duliege, A., Ammann, A., Kahn, J., Allen, D., Eichberg, J., Byrn,
R., Capon, D., Ward, R., and Ashkenazi, A. CD4 immunoadhesins in anti-HIV
therapy: new developments. Int. J. Cancer Supplement 7, 69-72 (1992).

4. Ashkenazi, A., Capon, D., and Ward, R. Immunoadhesins. Int. Rev. Immunol 10,
217-225 (1993).

5. Ashkenazi, A., and Peralta, E. Muscarinic Receptors. In Handbook of Receptors
and Channels. (S. Peroutka, ed.), CRC Press, Boca Raton, Vol. I, p. 1-27, (1994).

6. Krantz, S. B., Means, R. T., Jr., Lina, J., Marsters, S. A., and Ashkenazi, A.
Inhibition of erythroid colony formation in vitro by gamma interferon. In



9. Apo-2 Ligand, a new member of the TNF family that induces apoptosis in tumor 
cells. Cambridge Symposium on TNF and Related Cytokines in Treatment of 
Cancer. Hilton-Head, NC, March 1996. 



25. Apoptosis control by death and decoy receptors. American Association for 
Cancer Research Conference, Whistler, BC, Canada, March 1999.
32. Apoptosis and cancer therapy. University of Pennsylvania School of Medicine,
Philadelphia, PA, Apr 2000.
DECLARATION OF PAUL POLAKIS, Ph.D.
I, Paul Polakis, Ph.D., declare and say as follows: 

1. I was awarded a Ph.D. by the Department of Biochemistry of the Michigan
State University in 1984. My scientific Curriculum Vitae is attached to and forms 
part of this Declaration (Exhibit A). 

2. I am currently employed by Genentech, Inc. where my job title is Staff
Scientist. Since joining Genentech in 1999, one of my primary responsibilities has
been leading Genentech's Tumor Antigen Project, which is a large research project
with a primary focus on identifying tumor cell markers that find use as targets for
both the diagnosis and treatment of cancer in humans.

3. As part of the Tumor Antigen Project, my laboratory has been analyzing 
differential expression of various genes in tumor cells relative to normal cells. 
The purpose of this research is to identify proteins that are abundantly expressed 
on certain tumor cells and that are either (i) not expressed, or (ii) expressed at 
lower levels, on corresponding normal cells. We call such differentially expressed 
proteins "tumor antigen proteins". When such a tumor antigen protein is
identified, one can produce an antibody that recognizes and binds to that protein. 
Such an antibody finds use in the diagnosis of human cancer and may ultimately 
serve as an effective therapeutic in the treatment of human cancer. 

4. In the course of the research conducted by Genentech's Tumor Antigen 
Project, we have employed a variety of scientific techniques for detecting and 
studying differential gene expression in human tumor cells relative to normal cells, 
at genomic DNA, mRNA and protein levels. An important example of one such 
technique is the well known and widely used technique of microarray analysis 
which has proven to be extremely useful for the identification of mRNA molecules 
that are differentially expressed in one tissue or cell type relative to another. In the 
course of our research using microarray analysis, we have identified 
approximately 200 gene transcripts that are present in human tumor cells at 
significantly higher levels than in corresponding normal human cells. To date, we 
have generated antibodies that bind to about 30 of the tumor antigen proteins 
expressed from these differentially expressed gene transcripts and have used these 
antibodies to quantitatively determine the level of production of these tumor 
antigen proteins in both human cancer cells and corresponding normal cells. We 
have then compared the levels of mRNA and protein in both the tumor and normal 
cells analyzed. 

5. From the mRNA and protein expression analyses described in paragraph 4
above, we have observed that there is a strong correlation between changes in the
level of mRNA present in any particular cell type and the level of protein
expressed from that mRNA in that cell type. In approximately 80% of our
observations we have found that increases in the level of a particular mRNA
correlate with changes in the level of protein expressed from that mRNA when
human tumor cells are compared with their corresponding normal cells.

6. Based upon my own experience accumulated in more than 20 years of
research, including the data discussed in paragraphs 4 and 5 above and my
knowledge of the relevant scientific literature, it is my considered scientific
opinion that for human genes, an increased level of mRNA in a tumor cell relative
to a normal cell typically correlates to a similar increase in abundance of the
encoded protein in the tumor cell relative to the normal cell. In fact, it remains a
central dogma in molecular biology that increased mRNA levels are predictive of
corresponding increased levels of the encoded protein. While there have been
published reports of genes for which such a correlation does not exist, it is my
opinion that such reports are exceptions to the commonly understood general rule
that increased mRNA levels are predictive of corresponding increased levels of the
encoded protein.

7. I hereby declare that all statements made herein of my own knowledge are 
true and that all statements made on information or belief are believed to be true, 
and further that these statements were made with the knowledge that willful false 
statements and the like so made are punishable by fine or imprisonment, or both, 
under Section 1001 of Title 18 of the United States Code and that such willful 
statements may jeopardize the validity of the application or any patent issued 
thereon. 
I, Paul Polakis, Ph.D., declare and say as follows:

1. I am currently employed by Genentech, Inc., where my job title is Staff
Scientist.

2. Since joining Genentech in 1999, one of my primary responsibilities has
been leading Genentech's Tumor Antigen Project, which is a large research
project with a primary focus on identifying tumor cell markers that find use
as targets for both the diagnosis and treatment of cancer in humans.

3. As I stated in my previous Declaration dated May 7, 2004 (attached as
Exhibit A), my laboratory has been employing a variety of techniques,
including microarray analysis, to identify genes which are differentially
expressed in human tumor tissue relative to normal human tissue. The
primary purpose of this research is to identify proteins that are abundantly
expressed on certain human tumor tissue(s) and that are either (i) not
expressed, or (ii) expressed at detectably lower levels, on normal tissue(s).

4. In the course of our research using microarray analysis, we have identified 
approximately 200 gene transcripts that are present in human tumor tissue 
at significantly higher levels than in normal human tissue. To date, we 
have successfully generated antibodies that bind to 31 of the tumor antigen 
proteins expressed from these differentially expressed gene transcripts and
have used these antibodies to quantitatively determine the level of 
production of these tumor antigen proteins in both human tumor tissue and 
normal tissue. We have then quantitatively compared the levels of mRNA 
and protein in both the tumor and normal tissues analyzed. The results of 
these analyses are attached herewith as Exhibit B. In Exhibit B, means 
that the mRNA or protein was detectably overexpressed in the tumor tissue 
relative to normal tissue and means that no detectable overexpression 
overexpressed in human tumor tissue as compared to normal human tissue
tumor tissue and (ii) normal tissue, we have observed that in the vast
majority of cases, there is a very strong correlation between increases in 
mRNA expression and increases in the level of protein encoded by that 
mRNA. 



6. Based upon my own experience accumulated in more than 20 years of
research, including the data discussed in paragraphs 4-5 above and my
knowledge of the relevant scientific literature, it is my considered scientific
opinion that for human genes, an increased level of mRNA in a tumor
tissue relative to a normal tissue more often than not correlates to a similar
increase in abundance of the encoded protein in the tumor tissue relative to
the normal tissue. In fact, it remains a generally accepted working
assumption in molecular biology that increased mRNA levels are more
often than not predictive of elevated levels of the encoded protein. In fact,
an entire industry focusing on the research and development of therapeutic

7. I hereby declare that all statements made herein of my own knowledge are 
true and that all statements made on information or belief are believed to be 
true, and further that these statements were made with the knowledge that 
imprisonment, or both, under Section 1001 of Title 18 of the United States 

DECLARATION OF PAUL POLAKIS, Ph.D. 

I, Paul Polakis, Ph.D., declare and say as follows: 

1. I was awarded a Ph.D. by the Department of Biochemistry of the Michigan 
State University in 1984. My scientific Curriculum Vitae is attached to and forms 
part of this Declaration (Exhibit A). 

2. I am currently employed by Genentech, Inc. where my job title is Staff 
Scientist. Since joining Genentech in 1999, one of my primary responsibilities has 
been leading Genentech's Tumor Antigen Project, which is a large research project 
with a primary focus on identifying tumor cell markers that find use as targets for 
both the diagnosis and treatment of cancer in humans. 

3. As part of the Tumor Antigen Project, my laboratory has been analyzing 
differential expression of various genes in tumor cells relative to normal cells. 
The purpose of this research is to identify proteins that are abundantly expressed 
on certain tumor cells and that are either (i) not expressed, or (ii) expressed at 
lower levels on corresponding normal cells. We call such differentially expressed 
proteins "tumor antigen proteins". When such a tumor antigen protein is 
identified, one can produce an antibody that recognizes and binds to that protein. 
Such an antibody finds use in the diagnosis of human cancer and may ultimately 
serve as an effective therapeutic in the treatment of human cancer. 

4. In the course of the research conducted by Genentech's Tumor Antigen 
Project, we have employed a variety of scientific techniques for detecting and 
studying differential gene expression in human tumor cells relative to normal cells, 
at genomic DNA, mRNA and protein levels. An important example of one such 
technique is the well known and widely used technique of microarray analysis 
which has proven to be extremely useful for the identification of mRNA molecules 
that are differentially expressed in one tissue or cell type relative to another. In the 
course of our research using microarray analysis, we have identified 
approximately 200 gene transcripts that are present in human tumor cells at 
significantly higher levels than in corresponding normal human cells. To date, we 
have generated antibodies that bind to about 30 of the tumor antigen proteins 
expressed from these differentially expressed gene transcripts and have used these 
antibodies to quantitatively determine the level of production of these tumor 
antigen proteins in both human cancer cells and corresponding normal cells. We 
have then compared the levels of mRNA and protein in both the tumor and normal 
cells analyzed. 

5. From the mRNA and protein expression analyses described in paragraphs 
above, we have observed that there is a strong correlation between changes in the 
level of mRNA present in any particular cell type and the level of protein 
expressed from that mRNA in that cell type. In approximately 80% of our 
observations we have found that increases in the level of a particular mRNA 
correlates with changes in the level of protein expressed from that mRNA when 
human tumor cells are compared with their corresponding normal cells. 

6. Based upon my own experience accumulated in more than 20 years of 
research, including the data discussed in paragraphs 4 and 5 above and my 
knowledge of the relevant scientific literature, it is my considered scientific 
opinion that for human genes, an increased level of mRNA in a tumor cell relative 
to a normal cell typically correlates to a similar increase in abundance of the 
encoded protein in the tumor cell relative to the normal cell. In fact, it remains a 
central dogma in molecular biology that increased mRNA levels are predictive of 
corresponding increased levels of the encoded protein. While there have been 
published reports of genes for which such a correlation does not exist, it is my 
opinion that such reports are exceptions to the commonly understood general rule 
that increased mRNA levels are predictive of corresponding increased levels of the 
encoded protein. 

7. I hereby declare that all statements made herein of my own knowledge are 
true and that all statements made on information or belief are believed to be true, 
and further that these statements were made with the knowledge that willful false 
statements and the like so made are punishable by fine or imprisonment, or both, 
under Section 1001 of Title 18 of the United States Code and that such willful 
statements may jeopardize the validity of the application or any patent issued 
thereon. 

Genome-wide Study of Gene Copy Numbers, 
Transcripts, and Protein Levels in Pairs of 
Non-invasive and Invasive Human Transitional 
Cell Carcinomas* 

Gain and loss of chromosomal material is characteristic 
of bladder cancer, as well as malignant transformation in 
general. The consequences of these changes at both the 
transcription and translation levels is at present unknown 
partly because of technical limitations. Here we have 
attempted to address this question in pairs of non-invasive 
and invasive human bladder tumors using a combination 
of technology that included comparative genomic 
hybridization, high density oligonucleotide array-based 
monitoring of transcript levels (5600 genes), and high 
resolution two-dimensional gel electrophoresis. The results 
showed that there is a gene dosage effect that in some cases 
superimposes on other regulatory mechanisms. This effect 
depended (p < 0.015) on the magnitude of the comparative 
genomic hybridization change. In general (18 of 23 cases), 
chromosomal areas with more than 2-fold gain of DNA 
showed a corresponding increase in mRNA transcripts. 
Areas with loss of DNA, on the other hand, showed either 
reduced or unaltered transcript levels. Because most 
proteins resolved by two-dimensional gels are unknown it 
was only possible to compare mRNA and protein alterations 
in relatively few cases of well focused abundant proteins. 
With few exceptions we found a good correlation 
(p < 0.005) between transcript alterations and protein 
levels. The implications, as well as limitations, of the 
approach are discussed. Molecular & Cellular 
Proteomics 1:37-45, 2002. 

Aneuploidy is a common feature of most human cancers 
(1), but little is known about the genome-wide effect of this 
phenomenon at both the transcription and translation levels. 
High throughput array studies of the breast cancer cell line 
BT474 have suggested that there is a correlation between 
DNA copy numbers and gene expression in highly amplified 
areas (2), and studies of individual genes in solid tumors 
have revealed a good correlation between gene dose and 
mRNA or protein levels in the case of c-erb-B2, cyclin D1, 
ems1, and N-myc (3-5). However, a high cyclin D1 protein 
expression has been observed without simultaneous 
amplification (4), and a low level of c-myc copy number 
increase was observed without concomitant c-myc protein 
overexpression (6). 

In human bladder tumors, karyotyping, fluorescent in situ 
hybridization, and comparative genomic hybridization (CGH) 1 
have revealed chromosomal aberrations that seem to be 
characteristic of certain stages of disease progression. In the 
case of noninvasive pTa transitional cell carcinomas (TCCs), 
this includes loss of chromosome 9 or parts of it, as well as 
loss of Y in males. In minimally invasive pT1 TCCs, the 
following alterations have been reported: 2q-, 11p-, 1q+, 
11q13+, 17q+, and 20q+ (7-12). It has been suggested that 
these regions harbor tumor suppressor genes and 
oncogenes; however, the large chromosomal areas involved often 
contain many genes, making meaningful predictions of the 
functional consequences of losses and gains very difficult. 

In this investigation we have combined genome-wide tech- 
nology for detecting genomic gains and losses (CGH) with 
gene expression profiling techniques (microarrays and pro- 
teomics) to determine the effect of gene copy number on 
transcript and protein levels in pairs of non-invasive and 
invasive human bladder TCCs. 

EXPERIMENTAL PROCEDURES 

Material—Bladder tumor biopsies were sampled after informed 
consent was obtained and after removal of tissue for routine 
pathology examination. By light microscopy tumors 335 and 532 were 
staged by an experienced pathologist as pTa (superficial papillary), 

1 The abbreviations used are: CGH, comparative genomic hybrid- 
ization; TCC, transitional cell carcinoma; LOH, loss of heterozygosity; 
PA-FABP, psoriasis-associated fatty acid-binding protein; 2D, 
two-dimensional. 

Fig. 1. DNA copy number and mRNA expression level. Shown from left to right are chromosome (Chr.), CGH profiles, gene location and 
expression level of specific genes, and overall expression level along the chromosome. A, expression of mRNA in invasive tumor 733 as 
compared with the non-invasive counterpart tumor 335. B, expression of mRNA in invasive tumor 827 compared with the non-invasive 
counterpart tumor 532. The average fluorescent signal ratio between tumor DNA and normal DNA is shown along the length of the chromosome 
(left). The bold curve in the ratio profile represents a mean of four chromosomes and is surrounded by thin curves indicating one standard 
deviation. The central vertical line (broken) indicates a ratio value of 1 (no change), and the vertical lines next to it (dotted) indicate a ratio of 
0.5 (left) and 2.0 (right). In chromosomes where the non-invasive tumor 335 used for comparison showed alterations in DNA content, the ratio 
profile of that chromosome is shown to the right of the invasive tumor profile. The colored bars represent one gene each, identified by the 
running numbers above the bars (the name of the gene can be seen at www.MDLDKZsdata.html). The bars indicate the purported location of 
the gene, and the colors indicate the expression level of the gene in the invasive tumor compared with the non-invasive counterpart; >2-fold 
increase (black), >2-fold decrease (blue), no significant change (orange). The bar to the far right, entitled Expression, shows the resulting change 
in expression along the chromosome; the colors indicate that at least half of the genes were up-regulated (black), at least half of the genes 
down-regulated (blue), or more than half of the genes are unchanged (orange). If a gene was absent in one of the samples and present in 
another, it was regarded as more than a 2-fold change. A 2-fold level was chosen as this corresponded to one standard deviation in a double 
determination of ~1800 genes. Centromeres and heterochromatic regions were excluded from data analysis. 



grade I and II, respectively, tumors 733 and 827 were staged as pT1 
(invasive into submucosa), 733 was staged as solid, and 827 was 
staged as papillary, both grade III. 

mRNA Preparation—Tissue biopsies, obtained fresh from surgery, 
were embedded immediately in a sodium-guanidinium thiocyanate 
solution and stored at -80 °C. Total RNA was isolated using the 
RNAzol B RNA isolation method (WAK-Chemie Medical GMBH). 
Poly(A)+ RNA was isolated by an oligo(dT) selection step (Oligotex 
mRNA kit; Qiagen). 

cRNA Preparation—1 µg of mRNA was used as starting material. 
The first and second strand cDNA synthesis was performed using the 
Superscript® choice system (Invitrogen) according to the 
manufacturer's instructions but using an oligo(dT) primer containing a T7 RNA 
polymerase binding site. Labeled cRNA was prepared using the 
MEGAscript® in vitro transcription kit (Ambion). Biotin-labeled CTP and 
UTP (Enzo) were used, together with unlabeled NTPs in the reaction. 
Following the in vitro transcription reaction, the unincorporated 
nucleotides were removed using RNeasy columns (Qiagen). 

Array Hybridization and Scanning—Array hybridization and 
scanning was modified from a previous method (13). 10 µg of cRNA was 
fragmented at 94 °C for 35 min in buffer containing 40 mM Tris 
acetate, pH 8.1, 100 mM KOAc, 30 mM MgOAc. Prior to hybridization, 
the fragmented cRNA in a 6x SSPE-T hybridization buffer (1 M NaCl, 
10 mM Tris, pH 7.6, 0.005% Triton), was heated to 95 °C for 5 min, 
subsequently cooled to 40 °C, and loaded onto the Affymetrix probe 
array cartridge. The probe array was then incubated for 16 h at 40 °C 
at constant rotation (60 rpm). The probe array was exposed to 10 
washes in 6x SSPE-T at 25 °C followed by 4 washes in 0.5x SSPE-T 
at 50 °C. The biotinylated cRNA was stained with a 
streptavidin-phycoerythrin conjugate, 10 µg/ml (Molecular Probes), in 6x SSPE-T 
for 30 min at 25 °C followed by 10 washes in 6x SSPE-T at 25 °C. The 
probe arrays were scanned at 560 nm using a confocal laser scanning 
microscope (made for Affymetrix by Hewlett-Packard). The readings 
from the quantitative scanning were analyzed by Affymetrix gene 
expression analysis software. 

Microsatellite Analysis—Microsatellite analysis was performed as 
described previously (14). Microsatellites were selected by use of 
www.ncbi.nlm.nih.gov/genemap98, and primer sequences were 
obtained from the genome data base at www.gdb.org. DNA was extracted 
from tumor and blood and amplified by PCR in a volume of 20 µl for 35 
cycles. The amplicons were denatured and electrophoresed for 3 h in an 
ABI Prism 377. Data were collected in the Gene Scan program for 
fragment analysis. Loss of heterozygosity was defined as less than 33% 
of one allele detected in tumor amplicons compared with blood. 
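
The 33% criterion for loss of heterozygosity can be written as a short calculation. The sketch below is illustrative only and is not the authors' Gene Scan procedure: the function names, the use of peak heights as input, and the normalization of each tumor allele against the same allele in blood are all assumptions, since the text gives only the cut-off.

```python
def allele_retention(tumor_a, tumor_b, blood_a, blood_b):
    """Fraction of the reduced allele that remains in tumor relative to blood.

    Inputs are peak heights (or areas) for the two alleles of one
    heterozygous microsatellite marker. Scaling each tumor allele by its
    blood counterpart is an assumed normalization, not taken from the text.
    """
    if blood_a <= 0 or blood_b <= 0:
        raise ValueError("marker must show both alleles in blood")
    if tumor_a < 0 or tumor_b < 0 or (tumor_a == 0 and tumor_b == 0):
        raise ValueError("tumor sample must show at least one allele")
    ratio_a = tumor_a / blood_a
    ratio_b = tumor_b / blood_b
    # Compare the weaker allele with the stronger one, whichever it is.
    return min(ratio_a, ratio_b) / max(ratio_a, ratio_b)


def has_loh(tumor_a, tumor_b, blood_a, blood_b, threshold=0.33):
    """True when less than `threshold` of one allele is left in the tumor."""
    return allele_retention(tumor_a, tumor_b, blood_a, blood_b) < threshold
```

With hypothetical peak heights, `has_loh(200, 1000, 1000, 1000)` reports loss because only about 20% of the first allele remains, whereas `has_loh(900, 1000, 1000, 1000)` does not.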

Proteomic Analysis—TCCs were minced into small pieces and 
homogenized in a small glass homogenizer in 0.5 ml of lysis solution. 
Samples were stored at -20 °C until use. The procedure for 2D gel 
electrophoresis has been described in detail elsewhere (15, 16). Gels 
were stained with silver nitrate and/or Coomassie Brilliant Blue. 
Proteins were identified by a combination of procedures that included 
microsequencing, mass spectrometry, two-dimensional gel Western 
immunoblotting, and comparison with the master two-dimensional gel 
image of human keratinocyte proteins; see biobase.dk/cgi-bin/celis. 

CGH—Hybridization of differentially labeled tumor and normal DNA 
to normal metaphase chromosomes was performed as described 
previously (10). Fluorescein-labeled tumor DNA (200 ng), Texas 
Red-labeled reference DNA (200 ng), and human Cot-1 DNA (20 µg) were 
denatured at 37 °C for 5 min and applied to denatured normal 
metaphase slides. Hybridization was at 37 °C for 2 days. After washing, 
the slides were counterstained with 0.15 µg/ml 
4,6-diamidino-2-phenylindole in an anti-fade solution. A second hybridization was 
performed for all tumor samples using fluorescein-labeled reference DNA 
and Texas Red-labeled tumor DNA (inverse labeling) to confirm the 
aberrations detected during the initial hybridization. Each CGH 
experiment also included a normal control hybridization using 
fluorescein- and Texas Red-labeled normal DNA. Digital image analysis was 
used to identify chromosomal regions with abnormal fluorescence 
ratios, indicating regions of DNA gains and losses. The average 
green:red fluorescence intensity ratio profiles were calculated using 
four images of each chromosome (eight chromosomes total) with 
normalization of the green:red fluorescence intensity ratio for the 
entire metaphase and background correction. Chromosome 
identification was performed based on 4,6-diamidino-2-phenylindole 
banding patterns. Only images showing uniform high intensity 
fluorescence with minimal background staining were analyzed. All 
centromeres, p arms of acrocentric chromosomes, and 
heterochromatic regions were excluded from the analysis. 

RESULTS 

Comparative Genomic Hybridization— The CGH analysis 
identified a number of chromosomal gains and losses in the 

Table I 
Correlation between alterations detected by CGH and by expression monitoring 

Top, CGH used as independent variable (if CGH alteration - what expression ratio was found); bottom, altered expression used as 
independent variable (if expression alteration - what CGH deviation was found). 

Top 

Tumor 733 vs. 335 
CGH alterations    Expression change clusters    Concordance 
13 Gain            10 Up-regulation              77% 
                    0 Down-regulation 
                    3 No change 
10 Loss             1 Up-regulation              50% 
                    5 Down-regulation 
                    4 No change 

Tumor 827 vs. 532 
CGH alterations    Expression change clusters    Concordance 
10 Gain             8 Up-regulation              80% 
                    0 Down-regulation 
                    2 No change 
12 Loss             3 Up-regulation              17% 
                    2 Down-regulation 
                    7 No change 

Bottom 

Tumor 733 vs. 335 
Expression change clusters    CGH alterations    Concordance 
16 Up-regulation              11 Gain            69% 
                               2 Loss 
                               3 No change 
21 Down-regulation             1 Gain            38% 
                               8 Loss 
                              12 No change 
15 No change                   3 Gain            60% 
                               3 Loss 
                               9 No change 

Tumor 827 vs. 532 
Expression change clusters    CGH alterations    Concordance 
17 Up-regulation              10 Gain            59% 
                               5 Loss 
                               2 No change 
 9 Down-regulation             0 Gain            33% 
                               3 Loss 
                               6 No change 
21 No change                   1 Gain            81% 
                               3 Loss 
                              17 No change 




two invasive tumors (stage pT1, TCCs 733 and 827), whereas 
the two non-invasive papillomas (stage pTa, TCCs 335 and 
532) showed only 9p-, 9q22-q33-, and X-, and 7+, 9q-, 
and respectively. Both invasive tumors showed changes 
(1q22-24+, 2q14.1-qter-, 3q12-q13.3-, 6q12-q22-, 
9q34+, 11q12-q13+, 17+, and 20q11.2-q12+) that are 
typical for their disease stage, as well as additional alterations, 
some of which are shown in Fig. 1. Areas with gains and 
losses deviated from the normal copy number to some extent, 
and the average numerical deviation from normal was 0.4-fold 
in the case of TCC 733 and 0.3-fold for TCC 827. The largest 
changes, amounting to at least a doubling of chromosomal 
content, were observed at 1q23 in TCC 733 (Fig. 1A) and 
20q12 in TCC 827 (Fig. 1B). 

mRNA Expression in Relation to DNA Copy Number —The 
mRNA levels from the two invasive tumors (TCCs 827 and 
733) were compared with the two non-invasive counterparts 
(TCCs 532 and 335). This was done in two separate experi- 
ments in which we compared TCCs 733 to 335 and 827 to 
532, respectively, using two different scaling settings for the 
arrays to rule out scaling as a confounding parameter. Ap- 
proximately 1,800 genes that yielded a signal on the arrays 
were searched in the Unigene and Genemap data bases for 
chromosomal location, and those with a known location 
(1096) were plotted as bars covering their purported locus. In 
that way it was possible to construct a graphic presentation of 
DNA copy number and relative mRNA levels along the indi- 
vidual chromosomes (Fig. 1). 

For each mRNA a ratio was calculated between the level in 
the invasive versus the non-invasive counterpart. Bars, which 
represent chromosomal location of a gene, were color-coded 
according to the expression ratio, and only differences larger 
than 2-fold were regarded as informative (Fig. 1). The density 
of genes along the chromosomes varied, and areas 
containing only one gene were excluded from the calculations. The 
resolution of the CGH method is very low, and some of the 
outlier data may be because of the fact that the boundaries of 
the chromosomal aberrations are not known at high resolution. 
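
The scoring rules stated here and in the legend to Fig. 1 (a 2-fold cut-off per gene, absent versus present counted as more than 2-fold, regions scored by at least half of their genes, single-gene areas excluded) can be sketched as follows. This is a reading of the described rules, not the Affymetrix software used in the study; the function names, the use of `None` for a transcript scored as absent, and the order in which ties are resolved are assumptions.

```python
def classify_gene(invasive, non_invasive, fold=2.0):
    """Score one gene as 'up', 'down', or 'unchanged'.

    Arguments are positive expression levels for the invasive tumor and
    its non-invasive counterpart; None marks a transcript scored as
    absent (an assumed encoding).
    """
    if invasive is None and non_invasive is None:
        return "unchanged"
    if non_invasive is None:
        return "up"      # absent -> present is treated as more than 2-fold
    if invasive is None:
        return "down"    # present -> absent is treated as more than 2-fold
    ratio = invasive / non_invasive
    if ratio > fold:
        return "up"
    if ratio < 1.0 / fold:
        return "down"
    return "unchanged"


def summarize_region(calls):
    """Score a chromosomal region from the per-gene calls it contains."""
    if len(calls) < 2:
        return None  # areas containing only one gene were excluded
    if calls.count("up") * 2 >= len(calls):
        return "up"
    # Checking 'up' before 'down' is an assumption; the text does not say
    # how an exact half-and-half split was handled.
    if calls.count("down") * 2 >= len(calls):
        return "down"
    return "unchanged"
```

Using the PA-FABP signals reported in the legend to Fig. 5, `classify_gene(166, 1277)` scores the transcript as down-regulated in the invasive tumor.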

Two sets of calculations were made from the data. For the 
first set we used CGH alterations as the independent variable 
and estimated the frequency of expression alterations in these 
chromosomal areas. In general, areas with a strong gain of 
chromosomal material contained a cluster of genes having 
increased mRNA expression. For example, both chromo- 
somes 1q21-q25, 2p and 9q, showed a relative gain of more 
than 100% in DNA copy number that was accompanied by 
increased mRNA expression levels in the two tumor pairs (Fig. 
1). In most cases, chromosomal gains detected by CGH were 
accompanied by an increased level of transcripts in both 
TCCs 733 (77%) and 827 (80%) (Table I, top). Chromosomal 
losses, on the other hand, were not accompanied by 
decreased expression in several cases, and were often 
registered as having unaltered RNA levels (Table I, top). The 
inability to detect RNA expression changes in these cases was not 
because of fewer genes mapping to the lost regions (data not 
shown). 
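
The concordance percentages quoted above follow directly from the cluster counts in Table I (top). The sketch below recomputes them; the counts are transcribed from the table, while the dictionary layout and function name are illustrative assumptions.

```python
# Cluster counts transcribed from Table I, top: for each CGH alteration,
# how many regions showed each kind of expression change.
TABLE_I_TOP = {
    "733 vs. 335": {
        "gain": {"up": 10, "down": 0, "unchanged": 3},
        "loss": {"up": 1, "down": 5, "unchanged": 4},
    },
    "827 vs. 532": {
        "gain": {"up": 8, "down": 0, "unchanged": 2},
        "loss": {"up": 3, "down": 2, "unchanged": 7},
    },
}

# Expression change that counts as concordant with each CGH alteration.
CONCORDANT = {"gain": "up", "loss": "down"}


def concordance_percent(counts, alteration):
    """Percentage of regions whose expression change matches the CGH change."""
    total = sum(counts.values())
    return round(100 * counts[CONCORDANT[alteration]] / total)


for pair, alterations in TABLE_I_TOP.items():
    for alteration, counts in alterations.items():
        print(pair, alteration, concordance_percent(counts, alteration))
# 733 vs. 335: gain 77, loss 50
# 827 vs. 532: gain 80, loss 17
```

Adding the two tumor pairs together gives 18 of 23 gained regions with increased transcripts, the figure quoted in the abstract.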

In the second set of calculations we selected expression 
alterations above 2-fold as the independent variable and es- 
timated the frequency of CGH alterations in these areas. As 
above, we found that increased transcript expression corre- 
lated with gain of chromosomal material (TCC 733, 69% and 
TCC 827, 59%), whereas reduced expression was often de- 
tected in areas with unaltered CGH ratios (Table I, bottom). 
Furthermore, as a control we looked at areas with no 
alteration in expression. No alteration was detected by CGH in 
most of these areas (TCC 733, 60% and TCC 827, 81%; see 
Table I, bottom). Because the ability to observe reduced or 
increased mRNA expression clustering to a certain 
chromosomal area clearly reflected the extent of copy number 
changes, we plotted the maximum CGH aberrations in the 
regions showing CGH changes against the ability to detect a 
change in mRNA expression as monitored by the 
oligonucleotide arrays (Fig. 2). For both tumors TCC 733 (p < 0.015) and 
TCC 827 (p < 0.00003) a highly significant correlation was 
observed between the level of CGH ratio change (reflecting 
the DNA copy number) and alterations detected by the array 
based technology (Fig. 2). Similar data were obtained when 
areas with altered expression were used as independent 
variables. These areas correlated best with CGH when the CGH 
ratio deviated 1.6- to 2.0-fold (Table I, bottom) but mostly did 
not at lower CGH deviations. These data probably reflect that 
loss of an allele may only lead to a 50% reduction in 
expression level, which is at the cut-off point for detection of 
expression alterations. Gain of chromosomal material can occur to a 
much larger extent. 

Fig. 2. Correlation between maximum CGH aberration and the ability to detect expression change by oligonucleotide array 
monitoring. The aberration is shown as a numerical -fold change in ratio between invasive tumors 827 (A) and 733 (♦) and their non-invasive 
counterparts 532 and 335. The expression change was taken from the Expression line to the right in Fig. 1, which depicts the resulting 
expression change for a given chromosomal region. At least half of the mRNAs from a given region have to be either up- or down-regulated 
to be scored as an expression change. All chromosomal arms in which the CGH ratio plus or minus one standard deviation was outside the 
ratio value of one were included. 

Microsatellite-based Detection of Minor Areas of 
Losses—In TCC 733, several chromosomal areas exhibiting DNA 
amplification were preceded or followed by areas with a 
normal CGH but reduced mRNA expression (see Fig. 1, TCC 733 
chromosome 1q32, 2p21, and 7q21 and q32, 9q34, and 
10q22). To determine whether these results were because of 
undetected loss of chromosomal material in these regions or 
because of other non-structural mechanisms regulating 
transcription, we examined two microsatellites positioned at 
chromosome 1q25-32 and two at chromosome 2p22. Loss of 
heterozygosity (LOH) was found at both 1q25 and at 2p22 
indicating that minor deleted areas were not detected with the 
resolution of CGH (Fig. 3). Additionally, chromosome 2p in 
TCC 733 showed a CGH pattern of gain/no change/gain of 
DNA that correlated with transcript 
increase/decrease/increase. Thus, for the areas showing increased expression 
there was a correlation with the DNA copy number alterations 
(Fig. 1A). As indicated above, the mRNA decrease observed in 
the middle of the chromosomal gain was because of LOH, 
implying that one of the mechanisms for mRNA 
down-regulation may be regions that have undergone smaller losses of 
chromosomal material. However, this cannot be detected with 
the resolution of the CGH method. 

In both TCC 733 and TCC 827, the telomeric end of chro- 
mosome 11p showed a normal ratio in the CGH analysis; 
however, clusters of five and three genes, respectively, lost 
their expression. Two microsatellites (D11S1760, D11S922) 
positioned close to MUC2, IGF2, and cathepsin D indicated 
LOH as the most likely mechanism behind the loss of expres- 
sion (data not shown). 

A reduced expression of mRNA observed in TCC 733 at 
chromosomes 3q24, 11p11, 12p12.2, 12q21.1, and 16q24 
and in TCC 827 at chromosome 11p15.5, 12p11, 15q11.2, 
and 18q12 was also examined for chromosomal losses using 
microsatellites positioned as close as possible to the gene loci 

Fig. 3. Microsatellite analysis of loss of heterozygosity. Tumor 
733 showing loss of heterozygosity at chromosome 1q25, detected 
(a) by D1S215 close to Hu class I histocompatibility antigen (gene 
number 38 in Fig. 1), (b) by D1S2735 close to cathepsin E (gene 
number 41 in Fig. 1), and (c) at chromosome 2p23 by D2S2251 close 
to general β-spectrin (gene number 11 in Fig. 1) and of (d) tumor 827 
showing loss of heterozygosity at chromosome 18q12 by D18S1118 
close to mitochondrial 3-oxoacyl-coenzyme A thiolase (gene number 
12 in Fig. 1). The upper curves show the electropherogram obtained 
from normal DNA from leukocytes (N), and the lower curves show the 
electropherogram from tumor DNA (T). In all cases one allele is 
partially lost in the tumor amplicon. 

showing reduced mRNA transcripts. Only the microsatellite 
positioned at 18q12 showed LOH (Fig. 3), suggesting that 
transcriptional down-regulation of genes in the other regions 
may be controlled by other mechanisms. 

Relation between Changes in mRNA and Protein Levels— 
2D-PAGE analysis, in combination with Coomassie Brilliant 
Blue and/or silver staining, was carried out on all four tumors 
using fresh biopsy material. 40 well resolved abundant known 
proteins migrating in areas away from the edges of the pH 

Fig. 4. Correlation between protein levels as judged by 
2D-PAGE and transcript ratio. For comparison proteins were divided in 
three groups, unaltered in level or up- or down-regulated (horizontal 
axis). The mRNA ratio as determined by oligonucleotide arrays was 
plotted for each gene (vertical axis). A, mRNAs that were scored as 
present in both tumors used for the ratio calculation; A, mRNAs that 
were scored as absent in the invasive tumors (along horizontal axis) or 
as absent in non-invasive reference (top of figure). Two different 
scalings were used to exclude scaling as a confounder, TCCs 827 
and 532 (AA) were scaled with background suppression, and TCCs 
733 and 335 (#0) were scaled without suppression. Both 
comparisons showed highly significant (p < 0.005) differences in mRNA ratios 
between the groups. Proteins shown were as follows: Group A (from 
left), phosphoglucomutase 1, glutathione transferase class m number 
4, fatty acid-binding protein homologue, cytokeratin 15, and 
cytokeratin 13; B (from left), fatty acid-binding protein homologue, 28-kDa 
heat shock protein, cytokeratin 13, and calcyclin; C (from left), 
α-enolase, hnRNP B1, 28-kDa heat shock protein, 14-3-3-ε, and 
pre-mRNA splicing factor; D, mesothelial keratin K7 (type II); E (from 
top), glutathione S-transferase-π and mesothelial keratin K7 (type II); 
F (from top and left), adenylyl cyclase-associated protein, E-cadherin, 
keratin 19, calgizzarin, phosphoglycerate mutase, annexin IV, 
cytoskeletal γ-actin, hnRNP A1, integral membrane protein calnexin 
(IP90), hnRNP H, brain-type clathrin light chain-a, hnRNP F, 70-kDa 
heat shock protein, heterogeneous nuclear ribonucleoprotein A/B, 
translationally controlled tumor protein, liver 
glyceraldehyde-3-phosphate dehydrogenase, keratin 8, aldehyde reductase, and 
Na,K-ATPase β-1 subunit; G (from top and left), TCP20, calgizzarin, 
70-kDa heat shock protein, calnexin, hnRNP H, cytokeratin 15, ATP 
synthase, keratin 19, triosephosphate isomerase, hnRNP F, liver 
glyceraldehyde-3-phosphate dehydrogenase, glutathione 
S-transferase-π, and keratin 8; H (from left), plasma gelsolin, autoantigen 
calreticulin, thioredoxin, and NAD+-dependent 15-hydroxyprostaglandin 
dehydrogenase; I (from top), prolyl 4-hydroxylase β-subunit, 
cytokeratin 20, cytokeratin 17, prohibitin, and fructose 
1,6-biphosphatase; J, annexin II; K, annexin IV; L (from top and left), 90-kDa heat 
shock protein, prolyl 4-hydroxylase β-subunit, α-enolase, GRP 78, 
cyclophilin, and cofilin. 

gradient, and having a known chromosomal location, were 
selected for analysis in the TCC pair 827/532. Proteins were 
identified by a combination of methods (see "Experimental 
Procedures"). In general there was a highly significant corre- 
lation (p < 0.005) between mRNA and protein alterations (Fig. 
4). Only one gene showed disagreement between transcript 
alteration and protein alteration. Except for a group of 
cytokeratins encoded by genes on chromosome 17 (Fig. 5) the 
analyzed proteins did not belong to a particular family. 26 well 
focused proteins whose genes had a known chromosomal 
location were detected in TCCs 733 and 335, and of these 19 
correlated (p < 0.005) with the mRNA changes detected using 
the arrays (Fig. 4). For example, PA-FABP was highly 
expressed in the non-invasive TCC 335 but lost in the invasive 
counterpart (TCC 733; see Fig. 5). The smaller number of 
proteins detected in both 733 and 335 was because of the 
smaller size of the biopsies that were available. 

Fig. 5. Comparison of protein and transcript levels in invasive 
and non-invasive TCCs. The upper part of the figure shows a 2D gel 
(left) and the oligonucleotide array (right) of TCC 532. The red 
rectangles on the upper gel highlight the areas that are compared below. 
Identical areas of 2D gels of TCCs 532 and 827 are shown below. 
Clearly, cytokeratins 13 and 15 are strongly down-regulated in TCC 
827 (red annotation). The tile on the array containing probes for 
cytokeratin 15 is enlarged below the array (red arrow) from TCC 532 
and is compared with TCC 827. The upper row of squares in each tile 
corresponds to perfect match probes; the lower row corresponds to 
mismatch probes containing a mutation (used for correction for 
unspecific binding). Absence of signal is depicted as black, and the 
higher the signal the lighter the color. A high transcript level was 
detected in TCC 532 (6151 units) whereas a much lower level was 
detected in TCC 827 (absence of signals). For cytokeratin 13, a high 
transcript level was also present in TCC 532 (15659 units), and a 
much lower level was present in TCC 827 (623 units). The 2D gels at 
the bottom of the figure (left) show levels of PA-FABP and 
adipocyte-FABP in TCCs 335 and 733 (invasive), respectively. Both proteins are 
down-regulated in the invasive tumor. To the right we show the array 
tiles for the PA-FABP transcript. A medium transcript level was 
detected in the case of TCC 335 (1277 units) whereas very low levels 
were detected in TCC 733 (166 units). IEF, isoelectric focusing. 

11 chromosomal regions where CGH showed aberrations
that corresponded to the changes in transcript levels also 
showed corresponding changes in the protein level (Table II). 
These regions included genes that encode proteins that are 
found to be frequently altered in bladder cancer, namely 
cytokeratins 17 and 20, annexins II and IV, and the fatty 
acid-binding proteins PA-FABP and FBP1. Four of these pro- 
teins were encoded by genes in chromosome 17q, a fre- 
quently amplified chromosomal area in invasive bladder 
cancers. 

DISCUSSION 

Most human cancers have abnormal DNA content, having 
lost some chromosomal parts and gained others. The present 
study provides some evidence as to the effect of these gains 
and losses on gene expression in two pairs of non-invasive
and invasive TCCs using high throughput expression arrays 
and proteomics, in combination with CGH. In general, the 
results showed that there is a clear individual regulation of the 
mRNA expression of single genes, which in some cases was 
superimposed by a DNA copy number effect. In most cases, 
genes located in chromosomal areas with gains often exhib- 
ited increased mRNA expression, whereas areas showing 
losses showed either no change or a reduced mRNA expres- 
sion. The latter might be because of the fact that losses most 
often are restricted to loss of one allele, and the cut-off point 
for detection of expression alterations was a 2-fold change, 
thus being at the border of detection. In several cases,
however, an increase or decrease in DNA copy number was
associated with de novo occurrence or complete loss of
transcript, respectively. Some of these transcripts could not be
detected in the non-invasive tumor but were present at
relatively high levels in areas with DNA amplifications in the
invasive tumors (e.g. in TCC 733 transcript from cellular ligand of
annexin II gene (chromosome 1q21) from absent to 2670
arbitrary units; in TCC 827 transcript from small proline-rich
protein 1 gene (chromosome 1q12-q21.1) from absent to
1326 arbitrary units). It may be anticipated from these data
that significant clustering of genes with an increased
expression to a certain chromosomal area indicates an increased
likelihood of gain of chromosomal material in this area.

Table II

Proteins whose expression level correlates with both mRNA and gene dose changes

Protein                Chromosomal  Tumor    CGH         Transcript          Protein
                       location     TCC      alteration  alterationᵃ         alteration
Annexin II             1q21         733      Gain        Abs to Presᵃ        Increase
Annexin IV             2p13         733      Gain        3.9-Fold up         Increase
Cytokeratin 17         17q12-q21    827      Gain        3.8-Fold up         Increase
Cytokeratin 20         17q21.1      827      Gain        5.6-Fold up         Increase
(PA-)FABP              8q21.2       827      Loss        10-Fold down        Decrease
FBP1                   9q22         827      Gain        2.3-Fold up         Increase
Plasma gelsolin        9q31         827      Gain        Abs to Pres         Increase
Heat shock protein 28  15q12-q13    827      Loss        2.5-Fold up         Decrease
Prohibitin             17q21        827/733  Gain        3.7-/2.5-Fold upᵇ   Increase
Prolyl-4-hydroxylase   17q25        827/733  Gain        5.7-/1.6-Fold up    Increase
hnRNP B1               7p15         827      Loss        2.5-Fold down       Decrease

ᵃ Abs, absent; Pres, present.

ᵇ In cases where the corresponding alterations were found in both TCCs 827 and 733 these are shown as 827/733.

Considering the many possible regulatory mechanisms act- 
ing at the level of transcription, it seems striking that the gene 
dose effects were so clearly detectable in gained areas. One 
hypothetical explanation may lie in the loss of controlled 
methylation in tumor cells (17-19). Thus, it may be possible
that in chromosomes with increased DNA copy numbers two 
or more alleles could be demethylated simultaneously leading 
to a higher transcription level, whereas in chromosomes with 
losses the remaining allele could be partly methylated, turning 
off the process (20, 21). A recent report has documented a 
ploidy regulation of gene expression in yeast, but in this case all
the genes were present in the same ratio (22), a situation that is 
not analogous to that of cancer cells, which show marked 
chromosomal aberrations, as well as gene dosage effects. 

Several CGH studies of bladder cancer have shown that
some chromosomal aberrations are common at certain
stages of disease progression, often occurring in more than 1
of 3 tumors. In pTa tumors, these include 9p-, 9q-, 1q+, Y-
(2, 6), and in pT1 tumors, 2q-, 11p-, 11q-, 1q+, 5p+, 8q+,
17q+, and 20q+ (2-4, 6, 7). The pTa tumors studied here
showed similar aberrations such as 9p- and 9q22-q33- and
9q- and Y-, respectively. Likewise, the two minimal invasive
pT1 tumors showed aberrations that are commonly seen at
that stage, and TCC 827 had a remarkable resemblance to the
commonly seen pattern of losses and gains, such as 1q22-24
amplification (seen in both tumors), 11q14-q22 loss, the latter
often linked to 17q+ (both tumors), and 1q+ and 9p-, often
linked to 20q+ and 11q13+ (both tumors) (7-9). These
observations indicate that the pairs of tumors used in this study
exhibit chromosomal changes observed in many tumors, and
therefore the findings could be of general importance for
bladder cancer.

Considering that the mapping resolution of CGH is about
20 megabases, it is only possible to get a crude picture of
chromosomal instability using this technique. Occasionally,
we observed reduced transcript levels close to or inside
regions with increased copy numbers. Analysis of these regions
by positioning heterozygous microsatellites as close as
possible to the locus showing reduced gene expression revealed
loss of heterozygosity in several cases. It seems likely that
multiple and different events occur along each chromosomal
arm and that the use of cDNA microarrays for analysis of DNA
copy number changes will reach a resolution that can resolve
these changes, as has recently been proposed (2). The outlier
data were not more frequent at the boundaries of the CGH
aberrations. At present we do not know the mechanism
behind chromosomal aneuploidy and cannot predict whether
chromosomal gains will be transcribed to a larger extent than
the two native alleles. A mechanism such as genetic imprinting has
an impact on the expression level in normal cells and is often
reduced in tumors. However, the relation between imprinting
and gain of chromosomal material is not known.

We regard it as a strength of this investigation that we were
able to compare invasive tumors to benign tumors rather than
to normal urothelium, as the tumors studied were biologically
very close and probably represent successive steps in
the progression of bladder cancer. Despite the limited amount
of fresh tissue available it was possible to apply three different
state of the art methods. The observed correlation between
DNA copy number and mRNA expression is remarkable when
one considers that different pieces of the tumor biopsies were
used for the different sets of experiments. This indicates that
bladder tumors are relatively homogeneous, a notion recently
supported by CGH and LOH data that showed a remarkable
similarity even between tumors and distant metastasis (10, 23).

In the few cases analyzed, mRNA and protein levels
showed a striking correspondence, although in some cases
we found discrepancies that may be attributed to translational
regulation, post-translational processing, protein degradation,
or a combination of these. Some transcripts belong to
undertranslated mRNA pools, which are associated with few
translationally inactive ribosomes; these pools, however,
seem to be rare (24). Protein degradation, for example, may
be very important in the case of polypeptides with a short
half-life (e.g. signaling proteins). A poor correlation between
mRNA and protein levels was found in liver cells as determined
by arrays and 2D-PAGE (25), and a moderate correlation
was recently reported by Ideker et al. (26) in yeast.
Interestingly, our study revealed a much better correlation
between gained chromosomal areas and increased mRNA
levels than between loss of chromosomal areas and reduced
mRNA levels. In general, the level of CGH change determined
the ability to detect a change in transcript. One possible
explanation could be that by losing one allele the change in
mRNA level is not so dramatic as compared with gain of
material, which can be rather unlimited and may lead to a
severalfold increase in gene copy number resulting in a much
higher impact on transcript level. The latter would be much
easier to detect on the expression arrays as the cut-off point
was placed at a 2-fold level so as not to be biased by noise on
the array. Construction of arrays with a better signal to noise
ratio may in the future allow detection of lesser than 2-fold
alterations in transcript levels, a feature that may facilitate the
analysis of the effect of loss of chromosomal areas on
transcript levels.

In eleven cases we found a significant correlation between
DNA copy number, mRNA expression, and protein level. Four
of these proteins were encoded by genes located at a
frequently amplified area in chromosome 17q. Whether DNA
copy number is one of the mechanisms behind alteration of
these eleven proteins is at present unknown and will have to
be proved by other methods using a larger number of
samples. One factor making such studies complicated is the large
extent of protein modification that occurs after translation,
requiring immunoidentification and/or mass spectrometry to
correctly identify the proteins in the gels.

In conclusion, the results presented in this study exemplify
the large body of knowledge that may be possible to gather in 
the future by combining state of the art techniques that follow 
the pathway from DNA to protein (26). Here, we used a tradi- 
tional chromosomal CGH method, but in the future high reso- 
lution CGH based on microarrays with many thousand radiation 
hybrid-mapped genes will increase the resolution and informa- 
tion derived from these types of experiments (2). Combined with 
expression arrays analyzing transcripts derived from genes with 
known locations, and 2D gel analysis to obtain information at 
the post-translational level, a clearer and more developed un- 
derstanding of the tumor genome will be forthcoming. 

ABSTRACT

Genetic changes underlie tumor progression and may lead to
cancer-specific expression of critical genes. Over 1100 publications have
described the use of comparative genomic hybridization (CGH) to analyze
the pattern of copy number alterations in cancer, but very few of the genes
affected are known. Here, we performed high-resolution CGH analysis on
cDNA microarrays in breast cancer and directly compared copy number
and mRNA expression levels of 13,824 genes to quantitate the impact of
genomic changes on gene expression. We identified and mapped the
boundaries of 24 independent amplicons, ranging in size from 0.2 to 12
Mb. Throughout the genome, both high- and low-level copy number
changes had a substantial impact on gene expression, with 44% of the
highly amplified genes showing overexpression and 10.5% of the highly
overexpressed genes being amplified. Statistical analysis with random
permutation tests identified 270 genes whose expression levels across 14
samples were systematically attributable to gene amplification. These
included most previously described amplified genes in breast cancer and
many novel targets for genomic alterations, including the HOXB7 gene,
the presence of which in a novel amplicon at 17q21.3 was validated in
10.2% of primary breast cancers and associated with poor patient
prognosis. In conclusion, CGH on cDNA microarrays revealed hundreds of
novel genes whose overexpression is attributable to gene amplification.
These genes may provide insights to the clonal evolution and progression
of breast cancer and highlight promising therapeutic targets.

INTRODUCTION 

Gene expression patterns revealed by cDNA microarrays have 
facilitated classification of cancers into biologically distinct catego- 
ries, some of which may explain the clinical behavior of the tumors 
(1-6). Despite this progress in diagnostic classification, the molecular 
mechanisms underlying gene expression patterns in cancer have re- 
mained elusive, and the utility of gene expression profiling in the 
identification of specific therapeutic targets remains limited. 

Accumulation of genetic defects is thought to underlie the clonal 
evolution of cancer. Identification of the genes that mediate the effects 
of genetic changes may be important by highlighting transcripts that 
are actively involved in tumor progression. Such transcripts and their 
encoded proteins would be ideal targets for anticancer therapies, as 
demonstrated by the clinical success of new therapies against ampli- 
fied oncogenes, such as ERBB2 and EGFR (7, 8), in breast cancer and 
other solid tumors. Besides amplifications of known oncogenes, over
20 recurrent regions of DNA amplification have been mapped in
breast cancer by CGH⁵ (9, 10). However, these amplicons are often
large and poorly defined, and their impact on gene expression remains
unknown.

Fig. 1. Impact of gene copy number on global gene expression levels. A, percentage of
over- and underexpressed genes (Y axis) according to copy number ratios (X axis).
Threshold values used for over- and underexpression were >2.184 (global upper 7% of
the cDNA ratios) and <0.4826 (global lower 7% of the expression ratios). B, percentage
of amplified and deleted genes according to expression ratios. Threshold values for
amplification and deletion were >1.5 and <0.7.

We hypothesized that genome-wide identification of those gene 
expression changes that are attributable to underlying gene copy 
number alterations would highlight transcripts that are actively in- 
volved in the causation or maintenance of the malignant phenotype. 
To identify such transcripts, we applied a combination of cDNA and 
CGH microarrays to: (a) determine the global impact that gene copy 
number variation plays in breast cancer development and progression; 
and (b) identify and characterize those genes whose mRNA
expression is most significantly associated with amplification of the
corresponding genomic template.

5 The abbreviations used are: CGH, comparative genomic hybridization; FISH,
fluorescence in situ hybridization; RT-PCR, reverse transcription-PCR.

Fig. 2. Genome-wide copy number and expression analysis in the MCF-7 breast cancer
cell line. A, chromosomal CGH analysis of MCF-7. The copy number ratio profile (blue
line) [...] from 1p telomere to Xq telomere is shown along with ±1 SD (orange lines). The
black horizontal line [...]. B-C, genome-wide copy number analysis in MCF-7 by CGH on
cDNA microarray. The copy number ratios were plotted as a function of the position
of the cDNA [...]. Individual data points are connected with a line, and a moving median
of 10 adjacent clones is shown. [...] Individual data points are labeled by color coding
according to cDNA expression ratios. The bright red dots indicate the upper 2%, and
dark red dots, the next 5% of the expression ratios in MCF-7 cells (overexpressed genes);
bright green dots indicate the lowest 2%, and dark green dots, the next 5% of the
expression ratios (underexpressed genes); the rest of the observations are shown with
black crosses. [...] are shown at the bottom of the figure, and chromosome boundaries are
indicated with a dashed line.

MATERIALS AND METHODS 

Breast Cancer Cell Lines. Fourteen breast cancer cell lines (BT-20,
BT-474, HCC1428, Hs578t, MCF7, MDA-361, MDA-436, MDA-453, MDA-468,
SKBR-3, T-47D, UACC812, ZR-75-1, and ZR-75-30) were obtained from the
American Type Culture Collection (Manassas, VA). Cells were grown under
recommended culture conditions. Genomic DNA and mRNA were isolated
using standard protocols.

Copy Number and Expression Analyses by cDNA Microarrays. The
preparation and printing of the 13,824 cDNA clones on glass slides were
performed as described (11-13). Of these clones, 244 represented
uncharacterized expressed sequence tags, and the remainder corresponded to known
genes. CGH experiments on cDNA microarrays were done as described (14,
15). Briefly, 20 μg of genomic DNA from breast cancer cell lines and normal
human WBCs were digested for 14-18 h with AluI and RsaI (Life
Technologies, Inc., Rockville, MD) and purified by phenol/chloroform extraction. Six
μg of digested cell line DNAs were labeled with Cy3-dUTP (Amersham
Pharmacia) and normal DNA with Cy5-dUTP (Amersham Pharmacia) using
the Bioprime Labeling kit (Life Technologies, Inc.). Hybridization (14, 15) and
posthybridization washes (13) were done as described. For the expression
analyses, a standard reference (Universal Human Reference RNA; Stratagene,
La Jolla, CA) was used in all experiments. Forty ng of reference RNA were
labeled with Cy3-dUTP and 3.5 μg of test mRNA with Cy5-dUTP, and the
labeled cDNAs were hybridized on microarrays as described (13, 15). For both
microarray analyses, a laser confocal scanner (Agilent Technologies, Palo
Alto, CA) was used to measure the fluorescence intensities at the target
locations using the DEARRAY software (16). After background subtraction,
average intensities at each clone in the test hybridization were divided by the
average intensity of the corresponding clone in the control hybridization. For
the copy number analysis, the ratios were normalized on the basis of the
distribution of ratios of all targets on the array and for the expression analysis
on the basis of 88 housekeeping genes, which were spotted four times onto the
array. Low quality measurements (i.e., copy number data with mean reference
intensity <100 fluorescent units, and expression data with both test and
reference intensity <100 fluorescent units and/or with spot size <50 units)
were excluded from the analysis and were treated as missing values. The
distributions of fluorescence ratios were used to define cutpoints for increased/
decreased copy number. Genes with CGH ratio >1.43 (representing the upper
5% of the CGH ratios across all experiments) were considered to be amplified,
and genes with ratio <0.73 (representing the lower 5%) were considered to be
deleted.
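
The ratio calculation, quality filter, and cutpoints described above can be sketched as follows. This is only an illustrative sketch, not the DEARRAY pipeline: the function names are hypothetical, and dividing by the array median is an assumed stand-in for the normalization "on the basis of the distribution of ratios of all targets", which the text does not specify further.

```python
import numpy as np

def normalized_ratios(test, ref, min_ref=100.0):
    """Test/reference ratios; spots with reference intensity < min_ref become NaN."""
    test = np.asarray(test, dtype=float)
    ref = np.asarray(ref, dtype=float)
    ratios = np.full(test.shape, np.nan)
    ok = ref >= min_ref                      # low-quality spots are treated as missing
    ratios[ok] = test[ok] / ref[ok]
    # Assumption: center the array on its median ratio.
    return ratios / np.nanmedian(ratios)

def percentile_cutpoints(ratios, tail=5.0):
    """Lower and upper cutpoints taken from the tails of the ratio distribution."""
    low, high = np.nanpercentile(ratios, [tail, 100.0 - tail])
    return low, high

def call_copy_number(ratios, amplified=1.43, deleted=0.73):
    """+1 amplified, -1 deleted, 0 unchanged, NaN missing (cutpoints from the text)."""
    ratios = np.asarray(ratios, dtype=float)
    calls = np.zeros(ratios.shape)
    calls[ratios > amplified] = 1
    calls[ratios < deleted] = -1
    calls[np.isnan(ratios)] = np.nan
    return calls
```

In this sketch `percentile_cutpoints` would be applied to the pooled ratios of all experiments, which is how the text describes the 1.43 and 0.73 values as having been derived.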

Statistical Analysis of CGH and cDNA Microarray Data. To evaluate
the influence of copy number alterations on gene expression, we applied the
following statistical approach. CGH and cDNA calibrated intensity ratios were
log-transformed and normalized using median centering of the values in each
cell line. Furthermore, cDNA ratios for each gene across all 14 cell lines were
median centered. For each gene, the CGH data were represented by a vector
that was labeled 1 for amplification (ratio, >1.43) and 0 for no amplification.
Amplification was correlated with gene expression using the signal-to-noise
statistics (1). We calculated a weight, $w_g$, for each gene as follows:

$$w_g = \frac{m_{g1} - m_{g0}}{\sigma_{g1} + \sigma_{g0}}$$

where $m_{g1}$, $\sigma_{g1}$ and $m_{g0}$, $\sigma_{g0}$ denote the means and SDs for the expression
levels for amplified and nonamplified cell lines, respectively. To assess the
statistical significance of each weight, we performed 10,000 random
permutations of the label vector. The probability that a gene had a larger or equal
weight by random permutation than the original weight was denoted by α. A
low α (<0.05) indicates a strong association between gene expression and
amplification.
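
A minimal sketch of the weight and its permutation test for a single gene is given below. The function names are hypothetical, and the use of population SDs (`ddof=0`) is an assumption made so that a group containing a single cell line still has a defined SD; the text does not say which SD estimator was used.

```python
import numpy as np

def signal_to_noise(expr, labels):
    """Weight = (mean_amp - mean_nonamp) / (sd_amp + sd_nonamp) for one gene."""
    expr = np.asarray(expr, dtype=float)
    labels = np.asarray(labels)
    amp = expr[labels == 1]
    non = expr[labels == 0]
    # Assumption: population SD (ddof=0); undefined if both SDs are zero.
    return (amp.mean() - non.mean()) / (amp.std() + non.std())

def permutation_alpha(expr, labels, n_perm=10_000, seed=0):
    """Fraction of label permutations whose weight is >= the observed weight."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    observed = signal_to_noise(expr, labels)
    hits = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(labels)   # shuffle amplified/nonamplified labels
        if signal_to_noise(expr, shuffled) >= observed:
            hits += 1
    return observed, hits / n_perm
```

Here `expr` would hold the log-transformed, median-centered expression values of one gene across the 14 cell lines, and `labels` the corresponding 1/0 amplification vector.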

Genomic Localization of cDNA Clones and Amplicon Mapping. Each
cDNA clone on the microarray was assigned to a UniGene cluster using the
UniGene Build 141.* A database of genomic sequence alignment information
for mRNA sequences was created from the August 2001 freeze of the
University of California Santa Cruz's GoldenPath database.7 The chromosome and
bp positions for each cDNA clone were then retrieved by relating these data
sets. Amplicons were defined as a CGH copy number ratio >2.0 in at least two
adjacent clones in two or more cell lines or a CGH ratio >2.0 in at least three
adjacent clones in a single cell line. The amplicon start and end positions were



* Internet address: http//r(^arch.nhsri.nfo.gov/nu 
7 Internet address: www.genome.ucsc.edu.

Table 1 Summary of independent amplicons in 14 breast cancer cell lines by
CGH microarray

Location             Start (Mb)  End (Mb)  Size (Mb)
1p13                 [?]         [?]       0.2
1q21                 173.92      [?]       3.3
1q22                 179.28      [?]       0.3
3p14                 71.94       74.66     2.7
7p12.1-7p11.2        55.62       [?]       5.3
7q31                 125.73      [?]       5.2
7q32                 140.01      140.68    0.7
8q21.11-8q21.13      86.45       [?]       6.0
8q21.3               [?]         103.05    4.6
8q23.3-8q24.14       [?]         [?]       12.[?]
8q24.22              151.21      152.16    1.0
9p13                 38.65       39.25     0.6
13q22-q31            77.15       81.38     4.2
16q22                86.70       87.62     0.9
17q11                29.30       30.85     1.6
17q12-q21.2          39.79       42.80     3.0
17q21.32-q21.33      52.47       55.80     3.3
17q22-q23.3          63.81       69.70     5.9
17q23.3-q24.3        69.93       74.99     5.1
19q13                40.63       41.40     0.8
20q11.22             34.59       35.85     1.3
20q13.12             44.00       45.62     1.6
20q13.12-q13.13      46.45       49.43     3.0
20q13.2-q13.32       51.32       59.12     7.8


CGH were validated, with 1q21, 17q12-q21.2, 17q22-q23, 20q13.1,
and 20q13.2 regions being most commonly amplified. Furthermore,
the boundaries of these amplicons were precisely delineated. In
addition, novel amplicons were identified at 9p13 (38.65-39.25 Mb),
and 17q21.3 (52.47-55.80 Mb).

Direct Identification of Putative Amplification Target Genes.
The cDNA/CGH microarray technique enables the direct
correlation of copy number and expression data on a gene-by-gene basis
throughout the genome. We directly annotated high-resolution
CGH plots with gene expression data using color coding. Fig. 2C
shows that most of the amplified genes in the MCF-7 breast cancer
cell line at 1p13, 17q22-q23, and 20q13 were highly
overexpressed. A view of chromosome 7 in the MDA-468 cell line
implicates EGFR as the most highly overexpressed and amplified
gene at 7p11-p12 (Fig. 3A). In BT-474, the two known amplicons
at 17q12 and 17q22-q23 contained numerous highly
overexpressed genes (Fig. 3B). In addition, several genes, including the
homeobox genes HOXB2 and HOXB7, were highly amplified in a
previously undescribed independent amplicon at 17q21.3. HOXB7
was systematically amplified (as validated by FISH, Fig. 3B, inset)
as well as overexpressed (as verified by RT-PCR, data not shown)
in BT-474, UACC812, and ZR-75-30 cells. Furthermore, this novel



extended to include neighboring nonamplified clones (ratio, <1.5). The
amplicon size determination was partially dependent on local clone density.
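
The amplicon definition (CGH ratio >2.0 in at least two adjacent clones in two or more cell lines, or in at least three adjacent clones in a single cell line) can be sketched as below. The function names and the array layout are assumptions made for illustration; the sketch handles one chromosome at a time, treats missing values as not amplified, and omits the boundary extension to neighboring nonamplified clones.

```python
import numpy as np

def amplicon_mask(ratios, high=2.0):
    """ratios: 2-D array, rows = clones in genomic order on one chromosome,
    columns = cell lines. Returns a boolean mask of clones inside an amplicon."""
    amp = np.asarray(ratios, dtype=float) > high     # NaN compares as False
    n = amp.shape[0]
    mask = np.zeros(n, dtype=bool)
    # Rule 1: two adjacent amplified clones shared by at least two cell lines.
    pair_hit = (amp[:-1] & amp[1:]).sum(axis=1) >= 2
    mask[:-1] |= pair_hit
    mask[1:] |= pair_hit
    # Rule 2: three adjacent amplified clones in at least one cell line.
    if n >= 3:
        triple_hit = (amp[:-2] & amp[1:-1] & amp[2:]).any(axis=1)
        for k in range(3):
            mask[k:n - 2 + k] |= triple_hit
    return mask

def runs(mask):
    """Contiguous True runs of a boolean mask as (first_index, last_index) pairs."""
    out, start = [], None
    for i, flag in enumerate(mask):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            out.append((start, i - 1))
            start = None
    if start is not None:
        out.append((start, len(mask) - 1))
    return out
```

Mapping the index pairs returned by `runs` back to the bp positions of the first and last clone would give approximate start and end coordinates, which, as the text notes, depend on local clone density.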

FISH. Dual-color interphase FISH to breast cancer cell lines was done as
described (17). Bacterial artificial chromosome clone RP11-361K8 was
labeled with SpectrumOrange (Vysis, Downers Grove, IL), and
SpectrumOrange-labeled probe for EGFR was obtained from Vysis.
SpectrumGreen-labeled chromosome 7 and 17 centromere probes (Vysis) were used as a
reference. A tissue microarray containing 612 formalin-fixed,
paraffin-embedded primary breast cancers (17) was applied in FISH analyses as described
(18). The use of these specimens was approved by the Ethics Committee of the
University of Basel and by the NIH. Specimens containing a 2-fold or higher
increase in the number of test probe signals, as compared with corresponding
centromere signals, in at least 10% of the tumor cells were considered to be
amplified. Survival analysis was performed using the Kaplan-Meier method
and the log-rank test.

RT-PCR. The HOXB7 expression level was determined relative to
GAPDH. Reverse transcription and PCR amplification were performed using
Access RT-PCR System (Promega Corp., Madison, WI) with 10 ng of mRNA
as a template. HOXB7 primers were 5'-GAGCAGAGGGACTCGGACTT-3'
and 5'-GCGTCAGGTAGCGATTGTAG-3'.

RESULTS 

Global Effect of Copy Number on Gene Expression. 13,824
arrayed cDNA clones were applied for analysis of gene expression
and gene copy number (CGH microarrays) in 14 breast cancer cell
lines. The results illustrate a considerable influence of copy number
on gene expression patterns. Up to 44% of the highly amplified
transcripts (CGH ratio, >2.5) were overexpressed (i.e., belonged to
the global upper 7% of expression ratios), compared with only 6% for
genes with normal copy number levels (Fig. 1A). Conversely, 10.5%
of the transcripts with high-level expression (cDNA ratio, >10)
showed increased copy number (Fig. 1B). Low-level copy number
increases and decreases were also associated with similar, although
less dramatic, outcomes on gene expression (Fig. 1).
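
The kind of tabulation summarized in Fig. 1A, the fraction of overexpressed genes within each copy number bin, might be computed along the following lines. This is a hedged sketch: the function name and the bin edges are illustrative assumptions, and only the overexpression threshold of 2.184 is taken from the Fig. 1 legend.

```python
import numpy as np

def fraction_overexpressed(cgh, expr, bin_edges, over_threshold=2.184):
    """Fraction of genes with expression ratio > over_threshold in each
    half-open copy number bin [lo, hi). Returns {(lo, hi): fraction}."""
    cgh = np.asarray(cgh, dtype=float)
    expr = np.asarray(expr, dtype=float)
    over = expr > over_threshold
    result = {}
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (cgh >= lo) & (cgh < hi)
        # Empty bins are reported as NaN rather than 0.
        result[(lo, hi)] = over[in_bin].mean() if in_bin.any() else float("nan")
    return result
```

Swapping the roles of the two ratios (binning by expression ratio and counting genes above the amplification cutpoint) would give the converse view shown in Fig. 1B.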

Identification of Distinct Breast Cancer Amplicons. Base-pair
locations obtained for 11,994 cDNAs (86.8%) were used to plot copy
number changes as a function of genomic position (Fig. 2,
Supplement Fig. A). The average spacing of clones throughout the genome
was 267 kb. This high-resolution mapping identified 24 independent
breast cancer amplicons, spanning from 0.2 to 12 Mb of DNA (Table
1). Several amplification sites detected previously by chromosomal

Fig. 3. Annotation of gene expression data on CGH microarray profiles. A, genes in the
7p11-p12 amplicon in the MDA-468 cell line are highly expressed (red dots) and include
the EGFR oncogene. B, several genes in the 17q12, 17q21.3, and 17q23 amplicons in the
BT-474 breast cancer cell line are highly overexpressed (red) and include the HOXB7
gene. The data labels and color coding are as indicated for Fig. 2C. Insets show
chromosomal CGH profiles for the corresponding chromosomes and validation of the
increased copy number by interphase FISH using EGFR (red) and chromosome 7
centromere probe (green) to MDA-468 (A) and HOXB7-specific probe (red) and
chromosome 17 centromere (green) to BT-474 cells (B).

amplification was validated to be present in 10.2% of 363 primary
breast cancers by FISH to a tissue microarray and was associated
with poor prognosis of the patients (P = 0.001).

Statistical Identification and Characterization of 270 Highly
Expressed Genes in Amplicons. Statistical comparison of
expression levels of all genes as a function of gene amplification identified
270 genes whose expression was significantly influenced by copy
number across all 14 cell lines (Fig. 4, Supplemental Fig. B).
According to the gene ontology data,⁸ 91 of the 270 genes represented
hypothetical proteins or genes with no functional annotation, whereas
179 had associated functional information available. Of these, 151
(84%) are implicated in apoptosis, cell proliferation, signal
transduction, and transcription, whereas 28 (16%) had functional annotations
that could not be directly linked with cancer.



DISCUSSION 

The importance of recurrent gene and chromosome copy number
changes in the development and progression of solid tumors has been
characterized in >1000 publications applying CGH⁹ (9, 10), as well
as in a large number of other molecular cytogenetic, cytogenetic, and
molecular genetic studies. The effects of these somatic genetic
changes on gene expression levels have remained largely unknown,
although a few studies have explored gene expression changes
occurring in specific amplicons (15, 19-21). Here, we applied
genome-wide cDNA microarrays to identify transcripts whose expression
changes were attributable to underlying gene copy number alterations
in breast cancer.

The overall impact of copy number on gene expression patterns was
substantial with the most dramatic effects seen in the case of
high-level copy number increase. Low-level copy number gains and losses
also had a significant influence on expression levels of genes in the
regions affected, but these effects were more subtle on a gene-by-gene
basis than those of high-level amplifications. However, the impact of
low-level gains on the dysregulation of gene expression patterns in
cancer may be equally important if not more important than that of
high-level amplifications. Aneuploidy and low-level gains and losses
of chromosomal arms represent the most common types of genetic
alterations in breast and other cancers and, therefore, have an
influence on many genes. Our results in breast cancer extend the recent
studies on the impact of aneuploidy on global gene expression
patterns in yeast cells, acute myeloid leukemia, and a prostate cancer
model system (22-24).

8 Internet address: http://www.geneontology.org/.

9 Internet address: http://www.ncbi.nlm.nih.gov/entrez.

The CGH microarray analysis identified 24 independent breast 
cancer amplicons. We defined the precise boundaries for many am- 
plicons detected previously by chromosomal CGH (9, 10, 25, 26) and 
also discovered novel amplicons that had not been detected previ- 
ously, presumably because of their small size (only 1-2 Mb) or close 
proximity to other larger amplicons. One of these novel amplicons 
involved the homeobox gene region at 17q21.3 and led to the
overexpression of the HOXB7 and HOXB2 genes. The homeodomain
transcription factors are known to be key regulators of embryonic 
development and have been occasionally reported to undergo aberrant 
expression in cancer (27, 28). HOXB7 transfection induced cell pro- 
liferation in melanoma, breast, and ovarian cancer cells and increased 
tumorigenicity and angiogenesis in breast cancer (29-32). The pres- 
ent results imply that gene amplification may be a prominent mech- 
anism for overexpressing HOXB7 in breast cancer and suggest that 
HOXB7 contributes to tumor progression and confers an aggressive 
disease phenotype in breast cancer. This view is supported by our 
finding of amplification of HOXB7 in 10% of 363 primary breast 
cancers, as well as an association of amplification with poor prognosis 
of the patients. 

We carried out a systematic search to identify genes whose 
expression levels across all 14 cell lines were attributable to 
amplification status. Statistical analysis revealed 270 such genes
(representing ~2% of all genes on the array), including not only
previously described amplified genes, such as HER-2, MYC,
EGFR, ribosomal protein s6 kinase, and AIB3, but also numerous
novel genes such as NRAS-related gene (1p13), syndecan-2 (8q22),
and bone morphogenic protein (20q13.1), whose activation by
amplification may similarly promote breast cancer progression. 
Most of the 270 genes have not been implicated previously in 
breast cancer development and suggest novel pathogenetic mech- 
anisms. Although we would not expect all of them to be causally 
involved, it is intriguing that 84% of the genes with associated 
functional information were implicated in apoptosis, cell prolifer- 
ation, signal transduction, transcription, or other cellular processes 
that could directly imply a possible role in cancer progression. 
Therefore, a detailed characterization of these genes may provide 
biological insights to breast cancer progression and might lead to
the development of novel therapeutic strategies. 

In summary, we demonstrate application of cDNA microarrays 
to the analysis of both copy number and expression levels of over 
12,000 transcripts throughout the breast cancer genome, roughly 
once every 267 kb. This analysis provided: (a) evidence of a 
prominent global influence of copy number changes on gene 
expression levels; (b) a high-resolution map of 24 independent 
amplicons in breast cancer; and (c) identification of a set of 270 
genes, the overexpression of which was statistically attributable to
gene amplification. Characterization of a novel amplicon at
17q21.3 implicated amplification and overexpression of the
HOXB7 gene in breast cancer, including a clinical association
between HOXB7 amplification and poor patient prognosis. Overall,
our results illustrate how the identification of genes activated by
gene amplification provides a powerful approach to highlight 
genes with an important role in cancer as well as to prioritize and 
validate putative targets for therapy development. 
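
The marker spacing quoted in the summary can be checked with simple arithmetic. The sketch below assumes a haploid human genome of about 3,200 Mb; that figure is an assumption made here for illustration, since the text does not state the genome size used.

```python
# Assumed haploid genome size in base pairs; not stated in the source text.
GENOME_SIZE_BP = 3_200_000_000
# "over 12,000 transcripts" in the text; 12,000 is used as a round figure.
N_TRANSCRIPTS = 12_000


def mean_spacing_kb(genome_bp: int, n_markers: int) -> float:
    """Average spacing in kilobases if markers were evenly distributed."""
    return genome_bp / n_markers / 1_000


spacing_kb = mean_spacing_kb(GENOME_SIZE_BP, N_TRANSCRIPTS)
print(round(spacing_kb))  # 267
```

Under these assumptions the result agrees with the "roughly once every 267 kb" quoted above; real transcripts are of course not evenly spaced, so this is only an average.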



REFERENCES 

1. Golub, T. R., Slonim, D. K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J. P.,
Coller, H., Loh, M. L., Downing, J. R., Caligiuri, M. A., Bloomfield, C. D., and
Lander, E. S. Molecular classification of cancer: class discovery and class prediction
by gene expression monitoring. Science (Wash. DC), 286: 531-537, 1999.

2. Alizadeh, A. A., Eisen, M. B., Davis, R. E., Ma, C., Lossos, I. S., Rosenwald, A.,
Boldrick, J. C., Sabet, H., Tran, T., Yu, X., et al. Distinct types of diffuse large B-cell
lymphoma identified by gene expression profiling. Nature (Lond.), 403: 503-511,
2000.

3. Bittner, M., Meltzer, P., Chen, Y., Jiang, Y., Seftor, E., Hendrix, M., Radmacher, M.,
Simon, R., Yakhini, Z., Ben-Dor, A., et al. Molecular classification of cutaneous
malignant melanoma by gene expression profiling. Nature (Lond.), 406: 536-540,
2000.

4. Perou, C. M., Sørlie, T., Eisen, M. B., van de Rijn, M., Jeffrey, S. S., Rees, C. A.,
Pollack, J. R., Ross, D. T., Johnsen, H., Akslen, L. A., et al. Molecular portraits of
human breast tumours. Nature (Lond.), 406: 747-752, 2000.

5. Dhanasekaran, S. M., Barrette, T. R., Ghosh, D., Shah, R., Varambally, S., Kurachi,
K., Pienta, K. J., Rubin, M. A., and Chinnaiyan, A. M. Delineation of prognostic
biomarkers in prostate cancer. Nature (Lond.), 412: 822-826, 2001.

6. Sørlie, T., Perou, C. M., Tibshirani, R., Aas, T., Geisler, S., Johnsen, H., Hastie, T.,
Eisen, M. B., van de Rijn, M., Jeffrey, S. S., et al. Gene expression patterns of breast
carcinomas distinguish tumor subclasses with clinical implications. Proc. Natl. Acad.
Sci. USA, 98: 10869-10874, 2001.

7. Ross, J. S., and Fletcher, J. A. The HER-2/neu oncogene: prognostic factor, predictive
factor and target for therapy. Semin. Cancer Biol., 9: 125-138, 1999.

8. Arteaga, C. L. The epidermal growth factor receptor: from mutant oncogene in
nonhuman cancers to therapeutic target in human neoplasia. J. Clin. Oncol., 19:
32-40, 2001.

9. Knuutila, S., Bjorkqvist, A. M., Autio, K., Tarkkanen, M., Wolf, M., Monni, O.,
Szymanska, J., Larramendy, M. L., Tapper, J., Pere, H., El-Rifai, W., et al. DNA copy
number amplifications in human neoplasms: review of comparative genomic
hybridization studies. Am. J. Pathol., 152: 1107-1123, 1998.

10. Knuutila, S., Autio, K., and Aalto, Y. Online access to CGH data of DNA sequence
copy number changes. Am. J. Pathol., 157: 689, 2000.

11. DeRisi, J., Penland, L., Brown, P. O., Bittner, M. L., Meltzer, P. S., Ray, M., Chen,
Y., Su, Y. A., and Trent, J. M. Use of a cDNA microarray to analyse gene expression
patterns in human cancer. Nat. Genet., 14: 457-460, 1996.

12. Shalon, D., Smith, S. J., and Brown, P. O. A DNA microarray system for analyzing
complex DNA samples using two-color fluorescent probe hybridization. Genome
Res., 6: 639-645, 1996.

13. Mousses, S., Bittner, M. L., Chen, Y., Dougherty, E. R., Baxevanis, A., Meltzer, P. S.,
and Trent, J. M. Gene expression analysis by cDNA microarrays. In: F. J. Livesey and
S. P. Hunt (eds.), Functional Genomics, pp. 113-137. Oxford: Oxford University
Press, 2000.

14. Pollack, J. R., Perou, C. M., Alizadeh, A. A., Eisen, M. B., Pergamenschikov, A.,
Williams, C. F., Jeffrey, S. S., Botstein, D., and Brown, P. O. Genome-wide analysis
of DNA copy-number changes using cDNA microarrays. Nat. Genet., 23: 41-46,
1999.

15. Monni, O., Barlund, M., Mousses, S., Kononen, J., Sauter, G., Heiskanen, M.,
Paavola, P., Avela, K., Chen, Y., Bittner, M. L., and Kallioniemi, A. Comprehensive
copy number and gene expression profiling of the 17q23 amplicon in human breast
cancer. Proc. Natl. Acad. Sci. USA, 98: 5711-5716, 2001.

16. Chen, Y., Dougherty, E. R., and Bittner, M. L. Ratio-based decisions and the
quantitative analysis of cDNA microarray images. J. Biomed. Optics, 2: 364-374,
1997.

17. Barlund, M., Forozan, F., Kononen, J., Bubendorf, L., Chen, Y., Bittner, M. L.,
Torhorst, J., Haas, P., Bucher, C., Sauter, G., et al. Detecting activation of ribosomal
protein S6 kinase by complementary DNA and tissue microarray analysis. J. Natl.
Cancer Inst., 92: 1252-1259, 2000.

18. Andersen, C. L., Hostetter, G., Grigoryan, A., Sauter, G., and Kallioniemi, A.
Improved procedure for fluorescence in situ hybridization on tissue microarrays.
Cytometry, 45: 83-86, 2001.

19. Kauraniemi, P., Barlund, M., Monni, O., and Kallioniemi, A. New amplified and
highly expressed genes discovered in the ERBB2 amplicon in breast cancer by cDNA
microarrays. Cancer Res., 61: 8235-8240, 2001.

20. Clark, J., Edwards, S., John, M., Flohr, P., Gordon, T., Maillard, K., Giddings, I.,
Brown, C., Bagherzadeh, A., Campbell, C., Shipley, J., Wooster, R., and Cooper,
C. S. Identification of amplified and expressed genes in breast cancer by comparative
hybridization onto microarrays of randomly selected cDNA clones. Genes
Chromosomes Cancer, 34: 104-114, 2002.

21. Varis, A., Wolf, M., Monni, O., Vakkari, M. L., Kokkola, A., Moskaluk, C., Frierson,
H., Powell, S. M., Knuutila, S., Kallioniemi, A., and El-Rifai, W. Targets of gene
amplification and overexpression at 17q in gastric cancer. Cancer Res., 62: 2625-
2629, 2002.

22. Hughes, T. R., Roberts, C. J., Dai, H., Jones, A. R., Meyer, M. R., Slade, D.,
Burchard, J., Dow, S., Ward, T. R., Kidd, M. J., Friend, S. H., and Marton, M. J.
Widespread aneuploidy revealed by DNA microarray expression profiling. Nat.
Genet., 25: 333-337, 2000.

23. Virtaneva, K., Wright, F. A., Tanner, S. M., Yuan, B., Lemon, W. J., Caligiuri, M. A.,
Bloomfield, C. D., de La Chapelle, A., and Krahe, R. Expression profiling reveals
fundamental biological differences in acute myeloid leukemia with isolated trisomy 8
and normal cytogenetics. Proc. Natl. Acad. Sci. USA, 98: 1124-1129, 2001.

24. Phillips, J. L., Hayward, S. W., Wang, Y., Vasselli, J., Pavlovich, C., Padilla-Nash,
H., Pezullo, J. R., Ghadimi, B. M., Grossfeld, G. D., Rivera, A., Linehan, W. M.,
Cunha, G. R., and Ried, T. The consequences of chromosomal aneuploidy on gene
expression profiles in a cell line model for prostate carcinogenesis. Cancer Res., 61:
8143-8149, 2001.

25. Barlund, M., Tirkkonen, M., Forozan, F., Tanner, M. M., Kallioniemi, O. P., and
Kallioniemi, A. Increased copy number at 17q22-q24 by CGH in breast cancer is due
to high-level amplification of two separate regions. Genes Chromosomes Cancer, 20:
372-376, 1997.

26. Tanner, M. M., Tirkkonen, M., Kallioniemi, A., Isola, J., Kuukasjarvi, T., Collins, C.,
Kowbel, D., Guan, X. Y., Trent, J., Gray, J. W., Meltzer, P., and Kallioniemi, O. P.
Independent amplification and frequent co-amplification of three nonsyntenic regions
on the long arm of chromosome 20 in human breast cancer. Cancer Res., 56:
3441-3445, 1996.

27. Cillo, C., Faiella, A., Cantile, M., and Boncinelli, E. Homeobox genes and cancer.
Exp. Cell Res., 248: 1-9, 1999.

28. Cillo, C., Cantile, M., Faiella, A., and Boncinelli, E. Homeobox genes in normal and
malignant cells. J. Cell. Physiol., 188: 161-169, 2001.

29. Care, A., Silvani, A., Meccia, E., Mattia, G., Stoppacciaro, A., Parmiani, G., Peschle,
C., and Colombo, M. P. HOXB7 constitutively activates basic fibroblast growth
factor in melanomas. Mol. Cell. Biol., 16: 4842-4851, 1996.

30. Care, A., Silvani, A., Meccia, E., Mattia, G., Peschle, C., and Colombo, M. P.
Transduction of the SkBr3 breast carcinoma cell line with the HOXB7 gene induces
bFGF expression, increases cell proliferation and reduces growth factor dependence.
Oncogene, 16: 3285-3289, 1998.

31. Care, A., Felicetti, F., Meccia, E., Bottero, L., Parenza, M., Stoppacciaro, A., Peschle,
C., and Colombo, M. P. HOXB7: a key factor for tumor-associated angiogenic switch.
Cancer Res., 61: 6532-6539, 2001.

32. Naora, H., Yang, Y. Q., Montz, F. J., Seidman, J. D., Kurman, R. J., and Roden, R. B.
A serologically identified tumor antigen encoded by a homeobox gene promotes
growth of ovarian epithelial cells. Proc. Natl. Acad. Sci. USA, 98: 4060-4065, 2001.






Microarray analysis reveals a major direct role of 
DNA copy number alteration in the transcriptional 
program of human breast tumors 

Jonathan R. Pollack, Therese Sørlie, Charles M. Perou, Christian A. Rees, Stefanie S. Jeffrey, Per E. Lønning,
Robert Tibshirani, David Botstein, Anne-Lise Børresen-Dale, and Patrick O. Brown

Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599



Contributed by Patrick O. Brown, August 6, 2002
Genomic DNA copy number alterations are key genetic events in 
the development and progression of human cancers. Here we 
report a genome-wide microarray comparative genomic hybrid- 
ization (array CGH) analysis of DNA copy number variation in 
a series of primary human breast tumors. We have profiled DNA 
copy number alteration across 6,691 mapped human genes, in 44
predominantly advanced, primary breast tumors and 10 breast 
cancer cell lines. While the overall patterns of DNA amplification 
and deletion corroborate previous cytogenetic studies, the high- 
resolution (gene-by-gene) mapping of amplicon boundaries and 
the quantitative analysis of amplicon shape provide significant 
improvement in the localization of candidate oncogenes. Parallel
microarray measurements of mRNA levels reveal the remarkable 
degree to which variation in gene copy number contributes to 
variation in gene expression in tumor cells. Specifically, we find 
that 62% of highly amplified genes show moderately or highly 
elevated expression, that DNA copy number influences gene ex- 
pression across a wide range of DNA copy number alterations 
(deletion, low-, mid- and high-level amplification), that on average, 
a 2-fold change in DNA copy number is associated with a corresponding
1.5-fold change in mRNA levels, and that overall, at least
12% of all the variation in gene expression among the breast 
tumors is directly attributable to underlying variation in gene copy 
number. These findings provide evidence that widespread DNA 
copy number alteration can lead directly to global deregulation of 
gene expression, which may contribute to the development or 
progression of cancer. 

Conventional cytogenetic techniques, including comparative 
genomic hybridization (CGH) (1), have led to the identifi- 
cation of a number of recurrent regions of DNA copy number 
alteration in breast cancer cell lines and tumors (2-4). While 
some of these regions contain known or candidate oncogenes 
[e.g., FGFR1 (8p11), MYC (8q24), CCND1 (11q13), ERBB2
(17q12), and ZNF217 (20q13)] and tumor suppressor genes
[RB1 (13q14) and TP53 (17p13)], the relevant gene(s) within
other regions (e.g., gain of 1q, 8q22, and 17q22-24, and loss of
8p) remain to be identified. A high-resolution genome-wide 
map, delineating the boundaries of DNA copy number alter- 
ations in tumors, should facilitate the localization and identifi- 
cation of oncogenes and tumor suppressor genes in breast 
cancer. In this study, we have created such a map, using 
array-based CGH (5-7) to profile DNA copy number alteration 
in a series of breast cancer cell lines and primary tumors. 

An unresolved question is the extent to which the widespread 
DNA copy number changes that we and others have identified 
in breast tumors alter expression of genes within involved
regions. Because we had measured mRNA levels in parallel in
the same samples (8), using the same DNA microarrays, we had
an opportunity to explore on a genomic scale the relationship
between DNA copy number changes and gene expression. From
this analysis, we have identified a significant impact of widespread
DNA copy number alteration on the transcriptional
programs of breast tumors.

Materials and Methods 

Tumors and Cell Lines. Primary breast tumors were predominantly 
large (>3 cm), intermediate-grade, infiltrating ductal carcino- 
mas, with more than 50% being lymph node positive. The 
fraction of tumor cells within specimens averaged at least 50%. 
Details of individual tumors have been published (8, 9), and 
are summarized in Table 1, which is published as supporting 
information on the PNAS web site, www.pnas.org. Breast cancer 
cell lines were obtained from the American Type Culture 
Collection. Genomic DNA was isolated either using Qiagen 
genomic DNA columns, or by phenol/chloroform extraction 
followed by ethanol precipitation. 

DNA Labeling and Microarray Hybridizations. Genomic DNA label- 
ing and hybridizations were performed essentially as described 
in Pollack et al. (7), with slight modifications. Two micrograms 
of DNA was labeled in a total volume of 50 microliters and the 
volumes of all reagents were adjusted accordingly. "Test" DNA
(from tumors and cell lines) was fluorescently labeled (Cy5) and
hybridized to a human cDNA microarray containing 6,691 
different mapped human genes (i.e., UniGene clusters). The 
"reference" (labeled with Cy3) for each hybridization was nor- 
mal female leukocyte DNA from a single donor. The fabrication 
of cDNA microarrays and the labeling and hybridization of 
mRNA samples have been described (8). 

Data Analysis and Map Positions. Hybridized arrays were scanned 
on a GenePix scanner (Axon Instruments, Foster City, CA), and 
fluorescence ratios (test/reference) calculated using scanalyze 
software (available at http://rana.lbl.gov). Fluorescence ratios 
were normalized for each array by setting the average log 
fluorescence ratio for all array elements equal to 0. Measure- 
ments with fluorescence intensities more than 20% above back- 
ground were considered reliable. DNA copy number profiles 
that deviated significantly from background ratios measured in 
normal genomic DNA control hybridizations were interpreted as 
evidence of real DNA copy number alteration (see Estimating
Significance of Altered Fluorescence Ratios in the supporting 
information). When indicated, DNA copy number profiles are 
displayed as a moving average (symmetric 5-nearest neighbors). 
Map positions for arrayed human cDNAs were assigned by
identifying the starting position of the best and longest match of
any DNA sequence represented in the corresponding UniGene
cluster (10) against the "Golden Path" genome assembly
(http://genome.ucsc.edu/; Oct 7, 2000 Freeze). For UniGene
clusters represented by multiple arrayed elements, mean fluorescence
ratios (for all elements representing the same UniGene
cluster) are reported. For mRNA measurements, fluorescence 
ratios are "mean-centered" (i.e., reported relative to the mean 
ratio across the 44 tumor samples). The data set described here 
can be accessed in its entirety in the supporting information. 
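
The per-array processing described in this section can be sketched in a few lines. This is an illustrative sketch only, not the authors' software: the function names are hypothetical, the logarithm base (2) is an assumption because the text says only "log", the 20% reliability rule is read here as intensity greater than 1.2 times background, and "symmetric 5-nearest neighbors" is read as a 5-point window (the gene itself plus two neighbors on each side in map order).

```python
import math
from statistics import mean


def normalize_log_ratios(ratios):
    """Log-transform test/reference ratios and shift them so their mean is 0.

    Base 2 is an assumption; the text does not state the base.
    """
    logs = [math.log2(r) for r in ratios]
    offset = mean(logs)
    return [x - offset for x in logs]


def is_reliable(intensity, background, min_excess=0.20):
    """Assumed reading of the rule: intensity more than 20% above background."""
    return intensity > background * (1.0 + min_excess)


def moving_average(values, half_window=2):
    """Symmetric moving average over values already sorted by map position.

    half_window=2 gives a 5-point window (an assumed interpretation); the
    window is simply truncated at the two ends of the list.
    """
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        smoothed.append(mean(values[lo:hi]))
    return smoothed


def mean_center(values):
    """Report each value relative to the mean across samples (mRNA data)."""
    m = mean(values)
    return [v - m for v in values]


# Hypothetical toy ratios, not data from the study.
normalized = normalize_log_ratios([1.0, 2.0, 4.0])
print(normalized)  # [-1.0, 0.0, 1.0]
```

A moving average of this kind trades resolution for noise reduction, which is presumably why the text applies it only "when indicated".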

Results 

We performed CGH on 44 predominantly locally advanced, 
primary breast tumors and 10 breast cancer cell lines, using 
cDNA microarrays containing 6,691 different mapped human 
genes (Fig. 1a; also see Materials and Methods for details of
microarray hybridizations). To take full advantage of the improved
spatial resolution of array CGH, we ordered (fluorescence
ratios for) the 6,691 cDNAs according to the "Golden
Path" (http://genome.ucsc.edu/) genome assembly of the draft 
human genome sequences (11). In so doing, arrayed cDNAs not 
only themselves represent genes of potential interest (e.g., 
candidate oncogenes within amplicons), but also provide precise 
genetic landmarks for chromosomal regions of amplification and
deletion. Parallel analysis of DNA from cell lines containing
different numbers of X chromosomes (Fig. 1b), as we did before
(7), demonstrated the sensitivity of our method to detect single-copy
loss (45,XO), and 1.5- (47,XXX), 2- (48,XXXX), or
2.5-fold (49,XXXXX) gains (also see Fig. 5, which is published
as supporting information on the PNAS web site). Fluorescence
ratios were linearly proportional to copy number ratios, which
were slightly underestimated, in agreement with previous observations
(7). Numerous DNA copy number alterations were
evident in both the breast cancer cell lines and primary tumors
(Fig. 1a), detected in the tumors despite the presence of euploid
non-tumor cell types; the magnitudes of the observed changes 
were generally lower in the tumor samples. DNA copy-number 
alterations were found in every cancer cell line and tumor, and 
on every human chromosome in at least one sample. Recurrent 
regions of DNA copy number gain and loss were readily identifiable.
For example, gains within 1q, 8q, 17q, and 20q were
observed in a high proportion of breast cancer cell lines/tumors
(90%/69%, 100%/47%, 100%/60%, and 90%/44%, respectively),
as were losses within 1p, 3p, 8p, and 13q (80%/24%,
80%/22%, 80%/22%, and 70%/18%, respectively), consistent
with published cytogenetic studies (refs. 2-4; a complete listing
of gains/losses is provided in Tables 2 and 3, which are published 
as supporting information on the PNAS web site). The total
number of genomic alterations (gains and losses) was found to
be significantly higher in breast tumors that were high grade (P =
0.008), consistent with published CGH data (3), estrogen receptor
negative (P = 0.04), and harboring TP53 mutations (P =
0.0006) (see Table 4, which is published as supporting information
on the PNAS web site).
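
The X-chromosome titration described earlier in this section reduces to ratios against the two X chromosomes of the normal female reference DNA. The sketch below computes the ideal, uncompressed ratios; the measured fluorescence ratios were reported as slightly underestimated, so these are upper bounds rather than expected readings.

```python
# Normal female reference DNA carries two X chromosomes (46,XX).
REFERENCE_X_COPIES = 2

# Karyotype -> number of X chromosomes, as listed in the text.
x_copies = {"45,XO": 1, "47,XXX": 3, "48,XXXX": 4, "49,XXXXX": 5}

# Ideal test/reference copy number ratio for X-linked genes.
expected_ratio = {k: n / REFERENCE_X_COPIES for k, n in x_copies.items()}

for karyotype, ratio in expected_ratio.items():
    print(f"{karyotype}: {ratio:.1f}-fold")
# 45,XO: 0.5-fold
# 47,XXX: 1.5-fold
# 48,XXXX: 2.0-fold
# 49,XXXXX: 2.5-fold
```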

The improved spatial resolution of our array CGH analysis is 
illustrated for chromosome 8, which displayed extensive DNA 
copy number alteration in our series. A detailed view of the 
variation in the copy number of 241 genes mapping to chromo- 
some 8 revealed multiple regions of recurrent amplification; 
each of these potentially harbors a different known or previously 
uncharacterized oncogene (Fig. 2a). The complexity of amplicon
structure is most easily appreciated in the breast cancer cell line 
SKBR3. Although a conventional CGH analysis of 8q in SKBR3 
identified only two distinct regions of amplification (12), we 
observed three distinct regions of high-level amplification (labeled
1-3 in Fig. 2b). For each of these regions we can define the
boundaries of the interval recurrently amplified in the tumors we
examined; in each case, known or plausible candidate oncogenes 
can be identified (a description of these regions, as well as the 
recurrently amplified regions on chromosomes 17 and 20, can be 
found in Figs. 6 and 7, which are published as supporting 
information on the PNAS web site). 

For a subset of breast cancer cell lines and tumors (4 and 37, 
respectively), and a subset of arrayed genes (6,095), mRNA 
levels were quantitatively measured in parallel by using cDNA 
microarrays (8). The parallel assessment of mRNA levels is 
useful in the interpretation of DNA copy number changes. For 
example, the highly amplified genes that are also highly expressed
are the strongest candidate oncogenes within an amplicon.
Perhaps more significantly, our parallel analysis of DNA
copy number changes and mRNA levels provides us the oppor- 
tunity to assess the global impact of widespread DNA copy 
number alteration on gene expression in tumor cells. 

Pollack et al.

PNAS | October 1, 2002 | vol. 99 | no. 20 | 12965

[Fig. 3. Concordance between DNA copy number and gene expression across
chromosome 17. DNA copy number alteration (Upper) and mRNA levels (Lower)
are illustrated for breast cancer cell lines and tumors; the identical sample
order is maintained in both panels. Genes for which both DNA copy number
and mRNA levels were determined are ordered by position along the chromosome
(see Fig. 2 legend). Fluorescence ratios (test/reference) are depicted by
separate log2 pseudocolor scales.]

A strong influence of DNA copy number on gene expression
is evident in an examination of the pseudocolor representations
of DNA copy number and mRNA levels for genes on chromosome
17 (Fig. 3). The overall patterns of gene amplification and
elevated gene expression are quite concordant; i.e., a significant 
fraction of highly amplified genes appear to be correspondingly
highly expressed. The concordance between high-level amplifi- 
cation and increased gene expression is not restricted to chro- 
mosome 17. Genome-wide, of 117 high-level DNA amplifica- 
tions (fluorescence ratios >4, and representing 91 different 
genes), 62% (representing 54 different genes; see Table 5, which 
is published as supporting information on the PNAS web site) 
are found associated with at least moderately elevated mRNA 
levels (mean-centered fluorescence ratios >2), and 42% (rep- 
resenting 36 different genes) are found associated with compa- 
rably highly elevated mRNA levels (mean-centered fluorescence 
ratios >4). 
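
The tally behind these percentages can be sketched as a simple threshold count. This is a hypothetical illustration using the cutoffs quoted above (DNA ratio >4 for high-level amplification; mean-centered mRNA ratio >2 for moderate and >4 for high elevation); the function name and the toy measurements are invented for the example and are not the study's data.

```python
def summarize_amplified(pairs, dna_cutoff=4.0, moderate=2.0, high=4.0):
    """Summarize mRNA elevation among highly amplified measurements.

    pairs: iterable of (dna_ratio, mean_centered_mrna_ratio).
    Returns (n_amplified, fraction_at_least_moderate, fraction_high).
    """
    amplified = [mrna for dna, mrna in pairs if dna > dna_cutoff]
    n = len(amplified)
    if n == 0:
        return 0, 0.0, 0.0
    n_moderate = sum(1 for mrna in amplified if mrna > moderate)
    n_high = sum(1 for mrna in amplified if mrna > high)
    return n, n_moderate / n, n_high / n


# Hypothetical toy measurements, not data from the study.
toy = [(5.0, 4.5), (6.2, 2.5), (4.1, 1.1), (1.0, 3.0), (8.0, 5.0)]
print(summarize_amplified(toy))  # (4, 0.75, 0.5)
```

Note that the "moderate" fraction includes the "high" one, which matches the wording "at least moderately elevated" in the text.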

To determine the extent to which DNA deletion and lower- 
level amplification (in addition to high-level amplification) are 
also associated with corresponding alterations in mRNA levels, 
we performed three separate analyses on the complete data set 
(4 cell lines and 37 tumors, across 6,095 genes). First, we 
determined the average mRNA levels for each of five classes 
of genes, representing DNA deletion, no change, and low-, 
medium-, and high-level amplification (Fig. 4a). For both the
breast cancer cell lines and tumors, average mRNA levels
tracked with DNA copy number across all five classes, in a
statistically significant fashion (P values for t tests comparing
adjacent classes: cell lines, [...], 5 × 10^-5, 1 × 10^-2; tumors,
1 × 10^[...], 1 × 10^[...], 5 × 10^[...], 1 × 10^-4). A linear
regression of the average log(DNA copy
number), for each class, against average log(mRNA level)
demonstrated that on average, a 2-fold change in DNA copy
number was accompanied by 1.4- and 1.5-fold changes in mRNA
level for the breast cancer cell lines and tumors, respectively (Fig.
4a, regression line not shown). Second, we characterized the
distribution of the 6,095 correlations between DNA copy number
and mRNA level, each across the 37 tumor samples (Fig. 4b).
The distribution of correlations forms a normal-shaped curve,
but with the peak markedly shifted in the positive direction from
zero. This shift is statistically significant, as evidenced in a plot
of observed vs. expected correlations (Fig. 4c), and reflects a
pervasive global influence of DNA copy number alterations on
gene expression. Notably, the highest correlations between DNA
copy number and mRNA level (the right tail of the distribution
in Fig. 4b) comprise both amplified and deleted genes (data not
shown). Third, we used a linear regression model to estimate the
fraction of all variation measured in mRNA levels among the 37
tumors that could be attributed to underlying variation in DNA
copy number. From this analysis, we estimate that, overall, about 
7% of all of the observed variation in mRNA levels can be 
explained directly by variation in copy number of the altered 
genes (Fig. 4d). We can reduce the effects of experimental
measurement error on this estimate by using only that fraction
of the data most reliably measured (fluorescence intensity/
background >3); using that data, our estimate of the percent
variation in mRNA levels directly attributed to variation in gene
copy number increases to 12% (Fig. 4d). This still undoubtedly
represents a significant underestimate, as the observed variation 
in global gene expression is affected not only by true variation in 
the expression programs of the tumor cells themselves, but also 
by the variable presence of non-tumor cell types within clinical 
samples. 
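
Two of the quantities in this paragraph have simple arithmetic counterparts, sketched below. The first function converts a pair of fold changes into the slope of a log-log regression line; the second computes a per-gene Pearson correlation, whose square is the fraction of variance a simple linear fit would explain for that gene. These are illustrative assumptions: the text does not spell out the regression model the authors used for the genome-wide estimate, and the sample values are hypothetical.

```python
import math


def fold_change_slope(dna_fold, mrna_fold):
    """Slope of log(mRNA) against log(DNA copy number) implied by paired
    fold changes; the logarithm base cancels out."""
    return math.log(mrna_fold) / math.log(dna_fold)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Slopes implied by the reported fold changes.
slope_tumors = fold_change_slope(2.0, 1.5)  # about 0.58
slope_lines = fold_change_slope(2.0, 1.4)   # about 0.49

# Hypothetical log2 ratios for one gene across five samples (not study data).
dna = [0.0, 0.5, 1.0, 1.5, 2.0]
mrna = [0.1, 0.2, 0.7, 0.8, 1.2]
r = pearson_r(dna, mrna)
r_squared = r * r  # variance in mRNA explained by copy number for this gene
```

A slope below 1 on the log-log scale expresses the same point as the text: expression follows copy number, but less than proportionally.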

Discussion 

This genome-wide, array CGH analysis of DNA copy number 
alteration in a series of human breast tumors demonstrates the 
usefulness of defining amplicon boundaries at high resolution 
(gene-by-gene), and quantitatively measuring amplicon shape, to 
assist in locating and identifying candidate oncogenes. By analyzing
mRNA levels in parallel, we have also discovered that
changes in DNA copy number have a large, pervasive, direct
effect on global gene expression patterns in both breast cancer
cell lines and tumors. Although the DNA microarrays used in our
analysis may display a bias toward characterized and/or highly 
expressed genes, because we are examining such a large fraction 
of the genome (approximately 20% of all human genes), and 
because, as detailed above, we are likely underestimating the 
contribution of DNA copy number changes to altered gene 
expression, we believe our findings are likely to be generalizable 
(but would nevertheless still be remarkable if only applicable to 
this set of ~6,100 genes).

In budding yeast, aneuploidy has been shown to result in 
chromosome-wide gene expression biases (13). Two recent 
studies have begun to examine the global relationship between 
DNA copy number and gene expression in cancer cells. In 
agreement with our findings, Phillips et al. (14) have shown that 
with the acquisition of tumorigenicity in an immortalized pros- 
tate epithelial cell line, new chromosomal gains and losses 
resulted in a statistically significant respective increase and 
decrease in the average expression level of involved genes. In 
contrast, Platzer et al. (15) recently reported that in metastatic
colon tumors only ~4% of genes within amplified regions were
found more highly (>2-fold) expressed, when compared with 
normal colonic epithelium. This report differs substantially from 
our finding that 62% of highly amplified genes in breast cancer 
exhibit at least 2-fold increased expression. These contrasting
findings may reflect methodological differences between the
studies. For example, the study of Platzer et al. (15) may have
systematically under-measured gene expression changes. In this 
regard it is remarkable that only 14 transcripts of many thousand 
residing within unamplified chromosomal regions were found to 
exhibit at least 4-fold altered expression in metastatic colon 
cancer. Additionally, their reliance on lower-resolution chromosomal
CGH may have resulted in poorly delimiting the boundaries
of high-complexity amplicons, effectively overcalling regions
with amplification. Alternatively, the contrasting findings
for amplified genes may represent real biological differences 
between breast and metastatic colon tumors; resolution of this 
issue will require further studies.

Our finding that widespread DNA copy number alteration has 
a large pervasive and direct effect on global gene expression 
patterns in breast cancer has several important implications. 
First, this finding supports a high degree of copy number- 
dependent gene expression in tumors. Second, it suggests that 
most genes are not subject to specific autoregulation or dosage
compensation. Third, this finding cautions that elevated expression
of an amplified gene cannot alone be considered strong
independent evidence of a candidate oncogene's role in tumorigenesis.
In our study, fully 62% of highly amplified genes
demonstrated moderately or highly elevated expression. This 
highlights the importance of high-resolution mapping of amplicon
boundaries and shape [to identify the "driving" gene(s)
within amplicons (16)], on a large number of samples, in addition 
to functional studies. Fourth, this finding suggests that analyzing
the genomic distribution of expressed genes, even within existing
microarray gene expression data sets, may permit the inference
of DNA copy number aberration, particularly aneuploidy (where
gene expression can be averaged across large chromosomal
regions; see Fig. 3 and supporting information). Fifth, this [...]
behavior) among patients' tumors may be traceable to underlying
variation in DNA copy number. Sixth, this finding supports
a possible role for widespread DNA copy number alteration in 
tumorigenesis (17, 18), beyond the amplification of specific 
oncogenes and deletion of specific tumor suppressor genes. 
Widespread DNA copy number alteration, and the concomitant 
widespread imbalance in gene expression, might disrupt critical 
stoichiometric relationships in cell metabolism and physiology
(e.g., proteasome, mitotic spindle), possibly promoting further
chromosomal instability and directly contributing to tumor 
development or progression. Finally, our findings suggest the 
possibility of cancer therapies that exploit specific or global 
imbalances in gene expression in cancer. 

We thank the many members of the P.O.B. and D.B. labs for helpful
discussions. J.R.P. was a Howard Hughes Medical Institute Physician
Postdoctoral Fellow during a portion of this work. P.O.B. is a Howard
Hughes Medical Institute Associate Investigator. This work was
supported by grants from the National Institutes of Health, the Howard
Hughes Medical Institute, the Norwegian Cancer Society, and the
Norwegian Research Council.

9. Sørlie, T., Perou, C. M., Tibshirani, R., Aas, T., Geisler, S., Johnsen, H., Hastie, T.,
Eisen, M. B., van de Rijn, M., Jeffrey, S. S., et al. (2001) Proc. Natl. Acad.
Sci. USA 98, 10869-10874.

10. Schuler, G. D. (1997) J. Mol. Med. 75, 694-698.

11. Lander, E. S., Linton, L. M., Birren, B., Nusbaum, C., Zody, M. C., Baldwin,
J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., et al. (2001) Nature
(London) 409, 860-921.

12. Fejzo, M. S., Godfrey, T., Chen, C., Waldman, F. & Gray, J. W. (1998) Genes
Chromosomes Cancer 22, 105-113.

13. Hughes, T. R., Roberts, C. J., Dai, H., Jones, A. R., Meyer, M. R., Slade, D.,
Burchard, J., Dow, S., Ward, T. R., Kidd, M. J., et al. (2000) Nat. Genet. 25,
333-337.

14. Phillips, J. L., Hayward, S. W., Wang, Y., Vasselli, J., Pavlovich, C., Padilla-
Nash, H., Pezullo, J. R., Ghadimi, B. M., Grossfeld, G. D., Rivera, A., et al.
(2001) Cancer Res. 61, 8143-8149.

15. Platzer, P., Upender, M. B., Wilson, K., Willis, J., Lutterbaugh, J., Nosrati, A.,
Willson, J. K., Mack, D., Ried, T. & Markowitz, S. (2002) Cancer Res. 62,
1134-1138.

16. Albertson, D. G., Ylstra, B., Segraves, R., Collins, C., Dairkee, S. H., Kowbel,
D., Kuo, W. L., Gray, J. W. & Pinkel, D. (2000) Nat. Genet. 25, 144-146.

17. Li, R., Yerganian, G., Duesberg, P., Kraemer, A., Willer, A., Rausch, C. &
Hehlmann, R. (1997) Proc. Natl. Acad. Sci. USA 94, 14506-14511.

18. Rasnick, D. & Duesberg, P. H. (1999) Biochem. J. 340, 621-630.



12968 | www.pnas.org/cgi/doi/10.1073/pnas.162471999

Pollack et al.



TECHNICAL UPDATE

FROM YOUR LABORATORY SERVICES PROVIDER 

HER-2/neu Breast Cancer Predictive Testing 

Julie Sanford Hanna, Ph.D. and Dan Mornin, M.D. 



Each year, over 182,000 women in the United States are
diagnosed with breast cancer, and approximately 45,000 die 
of the disease. 1 Incidence appears to be increasing in the 
United States at a rate of roughly 2% per year. The reasons 
for the increase are unclear, but non-genetic risk factors appear 
to play a large role. 2 

Five-year survival rates range from approximately 65%- 
85%, depending on demographic group, with a significant 
percentage of women experiencing recurrence of their cancer 
within 10 years of diagnosis. One of the factors most predic- 
tive for recurrence once a diagnosis of breast cancer has been 
made is the number of axillary lymph nodes to which tumor 
has metastasized. Most node-positive women are given adjuvant
therapy, which increases their survival. However, 20%-
30% of patients without axillary node involvement also 
develop recurrent disease, and the difficulty lies in how to iden- 
tify this high-risk subset of patients. These patients could 
benefit from increased surveillance, early intervention, and 
treatment. 

Prognostic markers currently used in breast cancer recur- 
rence prediction include tumor size, histological grade, steroid 
hormone receptor status, DNA ploidy, proliferative index, and 
cathepsin D status. Expression of growth factor receptors and 
over-expression of the HER-2/neu oncogene have also been 
identified as having value regarding treatment regimen and 
prognosis. 

HER-2/neu (also known as c-erbB2) is an oncogene that 
encodes a transmembrane glycoprotein that is homologous 
to, but distinct from, the epidermal growth factor receptor. 
Numerous studies have indicated that high levels of expres- 
sion of this protein are associated with rapid tumor growth, 
certain forms of therapy resistance, and shorter disease-free 
survival. The gene has been shown to be amplified and/or 
overexpressed in 10%-30% of invasive breast cancers and in
40%-60% of intraductal breast carcinoma. 3 

There are two distinct FDA-approved methods by which 
HER-2/neu status can be evaluated: immunohistochemistry
(IHC, HercepTest™) and FISH (fluorescent in situ hybridization,
PathVysion™ Kit). Both methods can be performed on
archived and current specimens. The first method allows visual 
assessment of the amount of HER-2/neu protein present on 
the cell membrane. The latter method allows direct quantifi- 
cation of the level of gene amplification present in the tumor, 
enabling differentiation between low- versus high-amplification.
At least one study has demonstrated a difference in



recurrence risk in women younger than 40 years of age for 
low- versus high-amplified tumors (54.5% compared to 
85.7%); this is compared to a recurrence rate of 16.?% for 
patients with no HER-2/neu gene amplification. 4 HER-2/neu 
status may be particularly important to establish in women with 
small (< 1 cm) tumor size. 

The choice of methodology for determination of HER-2/neu
status depends in part on the clinical setting. FDA approval
for the Vysis FISH test was granted based on clinical trials
involving 1549 node-positive patients. Patients received one
of three different treatments consisting of different doses of 
cyclophosphamide, Adriamycin, and 5-fluorouracil (CAF). 
The study showed that patients with amplified HER-2/neu 
benefited from treatment with higher doses of adriamycin- 
based therapy, while those with normal HER-2/neu levels did 
not. The study therefore identified a sub-set of women, who 
because they did not benefit from more aggressive treatment, 
did not need to be exposed to the associated side effects. In 
addition, other evidence indicates that HER-2/neu amplifica- 
tion in node-negative patients can be used as an independent 
prognostic indicator for early recurrence, recurrent disease at 
any time and disease-related death. 5 Demonstration of HER- 
2/neu gene amplification by FISH has also been shown to be 
of value in predicting response to chemotherapy in stage-2 
breast cancer patients. 

Selection of patients for Herceptin® (Trastuzumab) mono- 
clonal antibody therapy, however, is based upon demonstra- 
tion of HER-2/neu protein overexpression using HercepTest™. 
Studies using Herceptin® in patients with metastatic breast 
cancer show an increase in time to disease progression, 
increased response rate to chemotherapeutic agents and a small 
increase in overall survival rate. The FISH assays have not yet 
been approved for this purpose, and studies looking at response 
to Herceptin® in patients with or without gene amplification 
status determined by FISH are in progress. 

In general, FISH and IHC results correlate well. However, 
subsets of tumors are found which show discordant results; 
i.e., protein overexpression without gene amplification or lack 
of protein overexpression with gene amplification. The clini- 
cal significance of such results is unclear. Based on the above 
considerations, HER-2/neu testing at SHMC/PAML will utilize
immunohistochemistry (HercepTest™) as a screen, followed
by FISH in IHC-negative cases. Alternatively, either
method may be ordered individually depending on the clini- 
cal setting or clinician preference. 
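
The testing sequence described above (IHC as a screen, FISH as a follow-up in IHC-negative cases, or either method ordered alone) can be written as a small decision function. This is a minimal sketch: the function name, the `order` values, and the boolean IHC result are illustrative assumptions, not part of the laboratory's actual ordering system.

```python
def her2_tests(order="screen", ihc_positive=None):
    """Return the list of assays to run for one specimen.

    order: "screen" (IHC first, FISH if IHC is negative),
           "ihc_only", or "fish_only" (hypothetical labels).
    ihc_positive: IHC result, required only for the "screen" pathway.
    """
    if order == "ihc_only":
        return ["IHC"]
    if order == "fish_only":
        return ["FISH"]
    if order != "screen":
        raise ValueError(f"unknown order: {order!r}")
    if ihc_positive is None:
        raise ValueError("an IHC result is required for the screen pathway")
    # Per the text, FISH follows only when the IHC screen is negative.
    return ["IHC"] if ihc_positive else ["IHC", "FISH"]
```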



AUGUST 1999



CPT code information 




HER-2/neu via IHC 

88342 (including interpretive report) 

HER-2/neu via FISH 

88271 x2 Molecular cytogenetics, DNA probe, each
88274 Molecular cytogenetics, interphase in situ hybridization,
analyze 25-99 cells
88291 Cytogenetics and molecular cytogenetics, interpretation
and report

Procedural Information 

Immunohistochemistry is performed using the FDA-approved
DAKO antibody kit, HercepTest®. The DAKO kit contains
reagents required to complete a two-step immunohistochemical
staining procedure for routinely processed, paraffin-embedded
specimens. Following incubation with the primary
rabbit antibody to human HER-2/neu protein, the kit employs
a ready-to-use dextran-based visualization reagent. This reagent
consists of both secondary goat anti-rabbit antibody
molecules and horseradish peroxidase molecules linked to a
common dextran polymer backbone, thus eliminating the need
for sequential application of link antibody and peroxidase-conjugated
antibody. Enzymatic conversion of the subsequently
added chromogen results in formation of visible
reaction product at the antigen site. The specimen is then counterstained;
a pathologist using light microscopy interprets
results.

FISH analysis at SHMC/PAML is performed using the
FDA-approved PathVysion™ HER-2/neu DNA probe kit, produced
by Vysis, Inc. Formalin-fixed, paraffin-embedded breast
tissue is processed using routine histological methods, and then
slides are treated to allow hybridization of DNA probes to the
nuclei present in the tissue section. The PathVysion™ kit contains
two direct-labeled DNA probes, one specific for the
alphoid repetitive DNA (CEP 17, spectrum orange) present at
the chromosome 17 centromere and the second for the HER-2/neu
oncogene located at 17q11.2-12 (spectrum green). Enumeration
of the probes allows a ratio of the number of copies
of chromosome 17 to the number of copies of HER-2/neu to
be obtained; this enables quantification of low versus high
amplification levels, and allows an estimate of the percentage
of cells with HER-2/neu gene amplification. The clinically
relevant distinction is whether the gene amplification is due
to increased gene copy number on the two chromosome 17
homologues normally present or an increase in the number of
chromosome 17s in the cells. In the majority of cases, ratio
equivalents less than 2.0 are indicative of a normal/negative
result; ratios of 2.1 and over indicate that amplification is
present and to what degree. Interpretation of this data will be
performed and reported from the Vysis-certified Cytogenetics
laboratory at SHMC.
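
The ratio-based interpretation above can be illustrated with a short sketch. It assumes the ratio is computed as total HER-2/neu signals divided by total CEP 17 signals over the scored nuclei, so that it rises with amplification. The function names, the per-nucleus count lists, and the "borderline" label for values between the two stated cutoffs are assumptions for illustration only.

```python
def her2_cep17_ratio(her2_signals, cep17_signals):
    """Ratio of total HER-2/neu signals to total CEP 17 signals.

    Each list holds the signal count for one scored nucleus (assumed input format).
    """
    if not her2_signals or len(her2_signals) != len(cep17_signals):
        raise ValueError("need one HER-2/neu and one CEP 17 count per nucleus")
    total_cep17 = sum(cep17_signals)
    if total_cep17 == 0:
        raise ValueError("no CEP 17 signals counted")
    return sum(her2_signals) / total_cep17


def interpret_ratio(ratio):
    """Apply the cutoffs quoted in the text: below 2.0 normal, 2.1 and over amplified."""
    if ratio < 2.0:
        return "not amplified"
    if ratio >= 2.1:
        return "amplified"
    return "borderline"  # assumption: the text does not name this interval


# Hypothetical counts: gene amplification on two chromosome 17 homologues...
amplified = her2_cep17_ratio([10, 12, 8, 10], [2, 2, 2, 2])
# ...versus extra copies of chromosome 17 with no change in the ratio.
polysomy = her2_cep17_ratio([4, 4, 4, 4], [4, 4, 4, 4])
```

In the second example each nucleus carries four gene copies, yet the ratio stays at 1.0, which is the distinction the paragraph draws between true gene amplification and an increased number of chromosome 17s.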
ABSTRACT The consistent cytogenetic translocation of
chronic myelogenous leukemia (the Philadelphia chromosome,
Ph 1) has been observed in cells of multiple hematopoietic
lineages. This translocation creates a chimeric gene composed
of breakpoint-cluster-region (bcr) sequences from chromosome
22 fused to a portion of the abl oncogene on chromosome 9. The
resulting gene product (P210 c-abl) resembles the transforming
protein of the Abelson murine leukemia virus in its structure
and tyrosine kinase activity. P210 c-abl is expressed in Ph 1-positive
cell lines of myeloid lineage and in clinical specimens
with myeloid predominance. We show here that Epstein-Barr
virus-transformed B-lymphocyte lines that retain Ph 1 can
express P210 c-abl. The level of expression in these B-cell lines is
generally lower and more variable than that observed for
myeloid lines. Protein expression is not related to amplification
of the abl gene but to variation in the level of bcr-abl mRNA
produced from a single Ph 1 template.



Chronic myelogenous leukemia (CML) is a disease of the 
pluripotent stem cell (1). In greater than 95% of patients, the 
leukemic cells contain the cytogenetic marker known as the 
Philadelphia chromosome, or Ph 1 (2). This reciprocal 
translocation event between the long arms of chromosomes 
9 and 22 has been used as a disease-specific marker for 
diagnosis and evaluation of therapy. Multiple hematopoietic 
lineages, including myeloid and B-lymphoid, contain Ph 1 in 
early or chronic phase, as well as in the more acute accel- 
erated and blast crisis phases of the disease. 

One molecular consequence of Ph 1 is the translocation of
the chromosomal arm containing the c-abl gene on chromosome
9 into the middle of the breakpoint-cluster region (bcr)
gene on chromosome 22 (3-6). Although the precise
translocation breakpoints are variable, an RNA-splicing
mechanism generates a very similar 8-kilobase (kb) mRNA in
each case (5-9). The hybrid bcr-abl message encodes a
structurally altered form of the abl oncogene product, called
P210 c-abl (10-13), with an amino-terminal segment derived
from a portion of the exons of bcr on chromosome 22 and a
carboxyl-terminal segment derived from a major portion of
the exons of the c-abl gene on chromosome 9. The chimeric
structure of bcr-abl and the resulting P210 c-abl is similar to the
structure of the Abelson murine leukemia virus gag-abl
genome and resulting P160 v-abl transforming gene product.
Both proteins have very similar tyrosine kinase activities (10,
11, 14) which can be distinguished by their relative stability
to denaturing detergents and by their ATP requirements from
the recently described tyrosine kinase activity of the c-abl
gene product (15).
In concert with structural modification of the amino-terminal
portion of the abl gene, increased level of expression
has been implicated in activation of c-abl oncogenic potential.
Myeloid and erythroid cell lines and clinical samples
derived from acute-phase CML patients contain about 10-fold
higher levels of the 8-kb bcr-abl mRNA and P210 c-abl than
the c-abl mRNA forms (6 and 7 kb) and P145 c-abl gene product
(5, 8, 9, 11). The higher level of expression of the chimeric
bcr-abl message in acute-phase cells is not likely to be solely
due to the presence of the bcr promoter sequences at the 5'
end of the gene, since the normal 4.5-kb and 6.7-kb bcr-encoded
mRNA species are expressed at an even lower level
than the normal c-abl messages (5, 6).

We have analyzed a series of Epstein-Barr virus-immortalized
B-lymphoid cell lines derived from CML patients (16).
With such in vitro clonal cell lines, we can evaluate whether
the presence of Ph 1 always results in synthesis of the chimeric
bcr-abl message and protein, and whether the quantitative
expression varies for cells of B-lymphoid lineage as compared
to previously examined myeloid cell lines. Our results
show that cell lines that retain Ph 1 do express bcr-abl message
and protein, but that the level is generally lower and more
variable than previously seen for myeloid cell lines. The
demonstration that the Ph 1 chromosomal template can vary
in its level of expression of P210 c-abl suggests that secondary
mechanisms, beyond the translocation itself, contribute to
the regulation of the bcr-abl gene in different cell types or
subclones that derive from the affected stem cell.

MATERIALS AND METHODS 

Cells and Cell Labelings. Epstein-Barr virus-transformed 
B-lymphoid cell lines were established from peripheral blood 
samples of chronic- and acute-phase CML patients as report- 
ed (16). The cell lines are designated according to patient 
number, karyotype, and lineage. For example, SK- 
CML7Bt(9,22)-33 refers to CML patient 7, B-lymphoid cell 
line, 9;22 translocation (Ph 1 ), cell line 33; and SK-CML7BN- 
2 refers to B-cell line 2 with a normal karyotype derived from 
the same patient. Repeat karyotype analysis was performed 
to verify the retention of Ph 1 just prior to analysis for abl 
protein and RNA. Cells were maintained in RPMI 1640 
medium with 20% fetal bovine serum. We have not observed 
any consistent pattern of in vitro growth rate that correlates 
to the stage of disease at the time of transformation with 
Epstein-Barr virus. Cells (1.5 x 10^7) were washed twice with
Dulbecco's modified Eagle's medium lacking phosphate and
supplemented with 5% dialyzed fetal bovine serum. Cells
were then resuspended in 2 ml of the minimal medium.
Labeling was started with the addition of [32P]orthophosphate
(1 mCi/ml; ICN; 1 Ci = 37 GBq) and continued at 37°C
for 3-4 hr.
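
As a check on the labeling conditions, the total activity per sample follows from the stated concentration and volume, and the SI value follows from the conversion given in the text (1 Ci = 37 GBq). The sketch below is only a worked calculation; the function and constant names are illustrative.

```python
MBQ_PER_MCI = 37.0  # 1 Ci = 37 GBq, so 1 mCi = 37 MBq


def total_activity_mci(concentration_mci_per_ml, volume_ml):
    """Total activity added to a labeling reaction, in mCi."""
    return concentration_mci_per_ml * volume_ml


# 1 mCi/ml of [32P]orthophosphate in 2 ml of medium, as described above.
activity_mci = total_activity_mci(1.0, 2.0)
activity_mbq = activity_mci * MBQ_PER_MCI
```

The result, 2 mCi (74 MBq) per 1.5 x 10^7 cells, agrees with the 2 mCi quoted in the legend to Fig. 1.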

Immunoprecipitation and Immunoblotting. Immunoprecipitations
were carried out as described (10). Cells (1.5 x 10^7)
were washed with phosphate-buffered saline and extracted
with 3-5 ml of phosphate lysis buffer (1% Triton X-100/0.1%
NaDodSO4/0.5% deoxycholate/10 mM Na2HPO4, pH 7.5/
100 mM NaCl) with 5 mM EDTA and 5 mM phenylmethylsulfonyl
fluoride. Extracts were clarified by centrifugation
and precipitated with normal or rabbit anti-abl sera (anti-pEX-2
or anti-pEX-5) (17). The precipitated proteins were
electrophoresed in a NaDodSO4/8% polyacrylamide gel.
32P-labeled proteins were detected by autoradiography.
Alternatively, abl proteins were detected by immunoblotting.
Extracts from unlabeled cells were clarified, and proteins
were concentrated by immunoprecipitation with rabbit antisera
against abl-encoded proteins [anti-pEX-2 and anti-pEX-5
combined (17)] and then fractionated in 8% acrylamide gels.
The proteins were transferred from the gel to nitrocellulose
filters, using protease-facilitated transfer (18). The abl-encoded
proteins were detected using murine monoclonal
antibodies as a probe and peroxidase-conjugated goat anti-mouse
second stage antibody (Bio-Rad) for development.
Rabbit antisera and mouse monoclonal antibodies to abl
proteins were prepared using bacterially expressed regions of
the v-abl protein as immunogens (17, 19). Anti-pEX-2 antibodies
react with the internal tyrosine kinase domain and
anti-pEX-5 antibodies react with the carboxyl-terminal segment
of the abl proteins.

RNA Analysis. RNA was extracted from 10^8 cells by the
NaDodSO4/urea/phenol method (20). Polyadenylylated
RNA was purified by oligo(dT) affinity chromatography.
Samples were electrophoresed in a 1% agarose/formaldehyde
gel and transferred to nitrocellulose. abl RNA species
were detected by hybridization with a nick-translated v-abl
fragment probe (21).

DNA Analysis. DNA was prepared from 5 x 10^7 cells of
each cell line and processed for Southern blots with a v-abl
probe as described (21).

RESULTS 

Variable Levels of P210 c-abl Are Detected in Ph 1-Positive Cell
Lines. Ph 1-positive and Ph 1-negative, Epstein-Barr virus-transformed
B-lymphocyte cell lines derived from the same
patient were examined for P210 c-abl synthesis by immunoprecipitation
of [32P]orthophosphate-labeled cell extracts
with anti-abl sera (Fig. 1). The normal c-abl protein P145 c-abl
was detected at a similar level in multiple Ph 1-positive and
Ph 1-negative cell lines. P210 c-abl was only detected in the
Ph 1-positive cell lines because the bcr-abl chimeric gene
which encodes P210 c-abl resides on the Ph 1 (4, 5, 11, 13). The
level of P210 c-abl was about 4- to 5-fold higher than the level
of P145 c-abl in the SK-CML7Bt-33 cell line (Fig. 1A, +). The
Ph 1-positive erythroid-progenitor cell line K562 (C) showed
a level of P210 c-abl about 10-fold higher than P145 c-abl.
However, the level of P210 c-abl was about one-fifth that of
P145 c-abl in the Ph 1-positive SK-CML16Bt-1 cell line (Fig. 1B,
+). Comparison of different autoradiographic exposures
roughly indicated that the level of P210 c-abl varies over a
20-fold range between these Ph 1-positive B-cell lines. Analysis
of four additional Ph 1-positive B-cell lines demonstrated
that the level of P210 c-abl fell into two general classes; some
cell lines had a level of P210 c-abl similar to SK-CML7Bt-33
and others had the low level similar to SK-CML16Bt-1 (Table
1). This differs from previous studies with Ph 1-positive
myeloid cell lines and patient samples derived from acute-
Fig. 1. Detection of variable levels of P210 c-abl in Ph 1-positive
B-cell lines. Production of P145 c-abl and P210 c-abl in Epstein-Barr
virus-transformed B-cell lines derived from a blast-crisis (A) and a
chronic-phase (B) CML patient was examined by metabolic labeling
with [32P]orthophosphate and immunoprecipitation. Ph 1-negative
(-) and Ph 1-positive (+) cell lines derived from each patient were
analyzed. The Ph 1-negative cell line in A,- is SK-CML7BN-2 and in
B,- is SK-CML16BN-1. The Ph 1-positive cell line in A,+ is
SK-CML7Bt-33 and in B,+ is SK-CML16Bt-1. The K562 cell line, a
Ph 1-positive erythroid progenitor cell line spontaneously derived
from a blast-crisis patient (33), is represented in C. Cells (1.5 x 10^7)
were metabolically labeled with 2 mCi of [32P]orthophosphate for 3-4
hr and then were extracted and clarified by centrifugation. Samples
were immunoprecipitated with control normal serum (lanes 1),
anti-pEX-2 (lanes 2), or anti-pEX-5 (lanes 3) and analyzed by
NaDodSO4/8% PAGE followed by autoradiography with an intensifying
screen (3 days for A and C, 10 days for B).

phase CML patients, in which P210 c-abl was detected at a
10-fold higher level than P145 c-abl (refs. 10 and 11; Table 1).
There was no large difference in level of chimeric mRNA and
P210 c-abl expressed in four myeloid/erythroid-lineage Ph 1-positive
cell lines (K562, EM2, EM3, CML22, and BV173;
refs. 9 and 11), despite a 4- to 5-fold amplification of
abl-related sequences in the K562 cell line.

Detection of different levels of P210 c-abl in Fig. 1 could be
due to decreased phosphorylation of P210 c-abl, a lower level
of P210 c-abl synthesis, or altered stability of the protein. To
help distinguish among these possibilities, the steady-state
level of P210 c-abl in the cell lines was assayed by immunoblotting.
The results show that SK-CML7Bt-33 (Fig. 2A, +)
had a higher level of P210 c-abl than P145, similar to the results
with metabolic labeling (Fig. 1). We did not detect P210 c-abl
by immunoblotting with 2 x 10^7 cells of line SK-CML8Bt-3
(Fig. 2B, +). Reconstruction experiments using dilutions of
cell extracts showed that we could detect about 5-10% of the
level of P210 c-abl expressed in the K562 cell line (data not
shown). We infer that the steady-state level of P210 c-abl in
SK-CML8Bt-3 is lower than the level in SK-CML7Bt-33 by
a factor of at least 10. The level of P210 c-abl detected in these
assays correlated with the amount of P210 c-abl tyrosine kinase
activity that could be detected in vitro (data not shown).

Different Levels of P210 c-abl Are Reflected in the Amount of
Stable bcr-abl mRNA. To identify the basis for detection of
variable levels of P210 c-abl, we examined the production of
the abl RNA. RNA blot hybridization analysis using a v-abl
probe (Fig. 3) showed that the normal 6- and 7-kb c-abl
mRNAs were present at a similar level in Ph 1-positive and
-negative cell lines derived from different patients. However,
the 8-kb mRNA that encodes P210 c-abl was detected at a
10-fold higher level in SK-CML7Bt-33 (Fig. 3A, +) than in
SK-CML16Bt-1 (B, +), which correlated with the relative
level of P210 c-abl detected in each cell line. Analysis of
additional cell lines demonstrated that the level of 8-kb RNA
directly correlated with the level of P210 c-abl (Table 1). The
variation in level of 8-kb RNA detected in these cell lines was
not due to loss or gain of Ph 1, because cytogenetic analysis
confirmed the presence of Ph 1 in these cell lines (ref. 16 and
Table 1. Relative levels of bcr-abl expression in Epstein-Barr
virus-immortalized B-cell lines and myeloid CML lines

Cell line*       CML phase†   Ph 1‡   P210§    8-kb mRNA¶

SK-CML7BN-2      BC           -
SK-CML8BN-10     Chronic      -
SK-CML8BN-12     Chronic      -
SK-CML16BN-1     Chronic      -
SK-CML35BN-1     Chronic      -
SK-CML7Bt-33     BC           +       +++      +++
SK-CML21Bt-1     Acc          +       +++      +++
SK-CML21Bt-6     Acc          +       +++      +++
SK-CML8Bt-3      Chronic      +       +
SK-CML16Bt-1     Chronic      +                +
SK-CML35Bt-2     Chronic      +       +        +
K562             BC           +       +++++    +++++
BV173            BC           +       +++++    +++++
EM2              BC           +       +++++    +++++

*Cell lines derived from CML patients by transformation with
Epstein-Barr virus as described (16). Names of cell lines indicate
patient number and Ph 1 status: SK-CML7Bt indicates a cell line
derived from patient 7 that carries the 9;22 Ph 1 translocation; N
indicates a normal karyotype. Myeloid-erythroid cell lines (K562,
EM2, and BV173) are described in previous publications (9, 11, 22,
33).

†Status of patient at the time cell line was derived. BC, blast crisis;
Acc, accelerated phase.

‡Presence (+) or absence (-) of Ph 1 as demonstrated by karyotypic
or Southern blot analysis.

§P210 c-abl detected as described in legend to Fig. 1. B-cell lines
derived from blast-crisis and accelerated-phase patients had levels
of P210 3- to 5-fold higher (+++) than levels of P145. Chronic-phase-derived
cell lines had P210 levels lower than or just equivalent
(+) to the level of P145. Myeloid and erythroid lines had levels of
P210 5- to 10-fold higher than P145 (+++++).

¶Eight-kilobase bcr-abl mRNA detected as described in legend to
Fig. 2. Symbols: ±, borderline detectable; +++++, level of 8-kb
mRNA 5- to 10-fold higher than that of the 6- and 7-kb c-abl mRNA
species; +++, level of 8-kb mRNA 3- to 5-fold higher than that of
the 6- and 7-kb species; +, a level approximately equivalent to that
of the 6- and 7-kb messages.

data not shown). There was no difference in the copy number
of abl-related sequences as judged by Southern blot analysis
(Fig. 4). Only the K562 cell line control showed an amplification
of abl sequences, as previously reported (22, 23).
These combined data suggest that differential bcr-abl mRNA
expression from a single gene template is responsible for the
variable levels of P210 c-abl detected. This could be mediated






Fig. 2. Analysis of steady-state abl protein levels by immunoblotting.
Cell extracts prepared from 2 x 10^7 cells of lines SK-CML7BN-2
(A,-), SK-CML7Bt-33 (A,+), SK-CML8BN-10 (B,-),
and SK-CML8Bt-3 (B,+) were concentrated by immunoprecipitation
with anti-pEX-2 plus anti-pEX-5. Samples were then electrophoresed
in a NaDodSO4/8% polyacrylamide gel and transferred to
nitrocellulose, using protease-facilitated transfer (18). abl proteins
were detected using a mixture of two monoclonal antibodies directed
against the pEX-2 and pEX-5 abl-protein fragments produced in
bacteria (19) as a probe and a peroxidase-conjugated goat anti-mouse
second-stage antibody (Bio-Rad) for development.





Fig. 3. Comparison of abl RNA levels in Ph 1-positive and
-negative B-cell lines. The levels of the normal 6- and 7-kb c-abl
RNAs and the 8-kb bcr-abl RNA were analyzed by blot hybridization
using a v-abl probe. RNA was extracted from Ph 1-negative lines
SK-CML7BN-2 (A,-) and SK-CML16BN-1 (B,-), from Ph 1-positive
lines SK-CML7Bt-33 (A,+) and SK-CML16Bt-1 (B,+), and
from line K562 (C,+) by the NaDodSO4/urea/phenol method (20).
Polyadenylylated RNA was purified by oligo(dT) affinity chromatography,
and 15 µg of each sample was electrophoresed in a 1%
agarose/formaldehyde gel and then transferred to nitrocellulose. The
blotted RNAs were hybridized with a nick-translated v-abl fragment
probe (21) and then autoradiographed for 4 days.

by factors influencing the transcription rate of the bcr-abl 
gene or the stability of the mRNA. 
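
The semi-quantitative symbols defined in the footnotes to Table 1 can be expressed as a small scoring function. This is a sketch under stated assumptions: the function name is hypothetical, a ratio of exactly 5 is assigned to the higher class, ratios between 1 and 3 are left unscored because the footnotes do not define them, and the "borderline detectable" (±) category is not modeled.

```python
def score_relative_level(ratio):
    """Map a fold ratio (P210 / P145, or 8-kb / 6- and 7-kb mRNA) to a Table 1 symbol.

    Returns None for ratios the footnotes do not assign to any class.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if ratio >= 5:
        return "+++++"  # 5- to 10-fold higher
    if ratio >= 3:
        return "+++"    # 3- to 5-fold higher
    if ratio <= 1:
        return "+"      # lower than or approximately equivalent
    return None         # assumption: 1 < ratio < 3 is not defined in the footnotes


# Fold ratios quoted in the Results text.
k562 = score_relative_level(10)       # about 10-fold higher than P145
line_7bt33 = score_relative_level(4)  # about 4- to 5-fold higher
line_16bt1 = score_relative_level(0.2)  # about one-fifth of P145
```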

DISCUSSION 

Several lines of evidence suggest that formation of Ph 1 is not
the primary event that affects the stem cell in CML. Patients
have been identified that present with the clinical picture of
CML but only later develop Ph 1 (1). This observation,
coupled with studies of G6PD (glucose-6-phosphate dehydrogenase)-heterozygous
females with CML that demonstrate
stem-cell clonality by isozyme analysis among cell
Fig. 4. Southern blot analysis of abl sequences in Ph 1-positive
and -negative B-cell lines. High molecular weight DNA (15 µg) was
digested with restriction endonuclease BamHI, separated in a 0.8%
agarose gel, and then transferred to nitrocellulose. The blotted DNA
fragments were hybridized with a nick-translated, 2.4-kb Bgl II v-abl
fragment (1.5 x 10^8 cpm/µg; ref. 21) and exposed for 4 days. (A)
Autoradiogram of abl-specific fragments in cell lines HL-60 (lane 1),
EM2 (lane 2), K562 (lane 3), SK-CML7Bt-33 (lane 4), SK-CML8Bt-3
(lane 5), SK-CML16Bt-1 (lane 6), SK-CML21Bt-6 (lane 7), SK-CML35Bt-2
(lane 8), SK-CML7BN-2 (lane 9), SK-CML8BN-2 (lane
10), and SK-CML35BN-1 (lane 11). (B) Ethidium bromide staining of
agarose gel prior to transfer to nitrocellulose, showing the level of
variation in amount of DNA loaded per lane.
populations that lack the Ph 1 marker, supports a secondary
or complementary role for Ph 1 in the progression of the 
disease (24, 25). This chromosome marker is found in 
chronic, accelerated, and blast-crisis phases of the disease. It 
is likely that Ph 1 confers some growth advantage, since cells 
with the marker chromosome eventually predominate the 
marrow and peripheral blood even in chronic phase. During 
the phase of blast crisis, many patients develop additional 
chromosome abnormalities, including duplication of Ph 1 , a 
variety of trisomies, and complex translocations (26). This 
is suggestive evidence for Ph 1 being a necessary but not 
sufficient genetic change for the full evolution of the 
disease. 

The realization that one molecular result of Ph 1 is the 
generation of a chimeric bcr-abl protein with functional 
characteristics and structure analogous to the gag-abl trans- 
forming protein of the Abelson murine leukemia virus 
strengthens the argument for an important role of Ph 1 in the 
pathogenesis of CML. Although the Abelson virus is gener- 
ally considered a rapidly transforming retrovirus, its effects 
can range from overcoming growth factor requirements, to 
cellular lethality, to induction of highly oncogenic tumors in 
a number of hematopoietic cell lineages (27, 28). Even in the 
transformation of murine cell targets, there are several lines 
of evidence that suggest that the growth-promoting activity of 
the v-abl gene product is complemented by further cellular 
changes in the production of the malignant-cell phenotype 
(29-31). 

The regulation of bcr-abl gene expression is complex 
because the 5' end of the gene is derived from the non-abl
sequences, bcr, normally found on chromosome 22 (6). The
level of stable message for the normal bcr gene and the 
normal abl gene are both much lower than the level of the 
bcr-abl message and protein from cell lines and clinical 
specimens derived from myeloid blast-crisis patients (5, 6, 
11). Therefore, the high level of bcr-abl expression cannot 
simply be attributed to the regulatory sequences associated 
with bcr. Possibly, creation of the chimeric gene disrupts the 
normal regulatory sequences and results in a higher level of 
expression. Variation in bcr-abl expression may result from 
secondary changes in the structure of the chimeric gene or 
function of trans-acting factors that occur during evolution of
the disease. Our analysis of P210 c-abl and the 8-kb mRNA in
Epstein-Barr virus-transformed Ph 1-positive B-cell lines
demonstrates that stable message and protein levels from the 
bcr-abl gene can vary over a wide range. This variation does 
not result from a change in the number of bcr-abl templates 
secondary to gene amplification but more likely from changes 
in either transcription rate or mRNA stability. We suspect 
this range of bcr-abl expression is not limited to lymphoid 
cells. Analysis of peripheral blood leukocytes derived from 
an unusual CML patient who has been in chronic phase with 
myeloid predominance for 16 years showed a level of 
P210 c-abl one-fifth that of P145 c-abl, as detected by metabolic
labeling with [32P]orthophosphate and immunoprecipitation
(S.C., O.N.W., and P. Greenberg, unpublished observa- 
tions). Lower levels of expression of the chimeric mRNA 
have been demonstrated in clinical samples from chronic- 
phase CML patients compared to acute-phase CML patients 
(9). Others have reported chronic-phase patients with vari- 
able but, in some cases, relatively high levels of the bcr-abl 
mRNA (32). The sampling variation and the heterogenous 
mixture of cell types in clinical samples complicate such 
analyses. Further work is needed to evaluate whether there 
is a defined change in P210 c-abl expression during the progression
of CML. It is interesting to note that among the
limited sample of Ph 1-positive B-cell lines we have examined
(Table 1), we have seen higher levels of P210 c-abl in those
derived from patients at more advanced stages of the disease.



It will be important to search for cell-type-specific mecha- 
nisms that might regulate expression of bcr-abl from Ph 1 . 
Aneuploidy and cancer

Subrata Sen, PhD 



Numeric aberrations in chromosomes, referred to as aneuploidy,
are commonly observed in human cancer. Whether aneuploidy
is a cause or consequence of cancer has long been
debated. Three lines of evidence now make a compelling case 
for aneuploidy being a discrete chromosome mutation event 
that contributes to malignant transformation and progression 
process. First, precise assay of chromosome aneuploidy in 
several primary tumors with in situ hybridization and compara- 
tive genomic hybridization techniques have revealed that 
specific chromosome aneusomies correlate with distinct tumor 
phenotypes. Second, aneuploid tumor cell lines and in vitro 
transformed rodent cells have been reported to display an 
elevated rate of chromosome instability, thereby indicating that 
aneuploidy is a dynamic chromosome mutation event associ- 
ated with transformation of cells. Third, and most important, a 
number of mitotic genes regulating chromosome segregation 
have been found mutated in human cancer cells, implicating 
such mutations in induction of aneuploidy in tumors. Some of 
these gene mutations, possibly allowing unequal segregations 
of chromosomes, also cause tumorigenic transformation of 
cells in vitro. In this review, the recent publications investigat- 
ing aneuploidy in human cancers, rate of chromosome instabil- 
ity in aneuploidy tumor cells, and genes implicated in regulat- 
ing chromosome segregation found mutated in cancer cells 

are discussed.

Curr Opin Oncol 2000, 12:82-88 © 2000 Lippincott Williams & Wilkins, Inc.

Cancer research over the past decade has firmly established
that malignant cells accumulate a large number of
genetic mutations that affect differentiation, prolifera- 
tion, and cell death processes. In addition, it is also 
recognized that most cancers are clonal, although they 
display extensive heterogeneity with respect to kary- 
otypes and phenotypes of individual clonal populations. 
It is estimated that numeric chromosomal imbalance, 
referred to as aneuploidy, is the most prevalent genetic
change recorded among over 20,000 solid tumors
analyzed thus far [1]. Phenotypic diversity of the clonal
populations in individual tumors involves differences in
morphology, proliferative properties, antigen expression, 
drug sensitivity, and metastatic potentials. It has been 
proposed that an underlying acquired genetic instability 
is responsible for the multiple mutations detected in 
cancer cells that lead to tumor heterogeneity and
progression [2]. In a somewhat contradictory argument, 
it has also been suggested that clonal expansion due to 
selection of cells undergoing normal rates of mutation 
can explain malignant transformation and progression 
process in humans [3]. Acquired genetic instability, 
nonetheless, is considered important for more rapid 
progression of the disease [4••]. Although the original
hypothesis on genetic instability in cancer primarily 
focused on chromosome imbalances in the form of aneu- 
ploidy in tumor cells, the actual relevance of such muta- 
tions in cancer remains a controversial issue. 

Whether or not aneuploidy contributes to the malignant 
transformation and progression process has long been 
debated. A prevalent idea on genetics of cancer referred
to as "somatic gene mutation hypothesis" contends that 
gene mutations at the nucleotide level alone can cause 
cancer by either activating cellular proto-oncogenes to
dominant cancer causing oncogenes and/or by inactivat- 
ing growth inhibitory tumor suppressor genes. In this 
scheme of things chromosomal instability in the form of 
aneuploidy is a mere consequence rather than a cause of 
malignant transformation and progression process. 

In this review, some of the recent observations on the 
subject are discussed and compelling evidence is 
provided to suggest that aneuploidy is a distinct form of 
genetic instability in cancer that frequently correlates 
with specific phenotypes and stages of the disease. 
Furthermore, discrete genetic targets affecting chromo- 
somal stability in cancer cells, recently identified, are 
also discussed. These data provide a new direction 
toward elucidating the molecular mechanisms responsible
for induction of aneuploidy in cancer and may eventually
be exploited as novel therapeutic targets in the
future. 

Genetic alterations in cancer 

Alterations in many genetic loci regulating growth, 
senescence, and apoptosis, identified in tumor cells, 
have led to the current understanding of cancer as a 
genetic disease. The genetic changes identified in 
tumors include: subtle mutations in genes at the 
nucleotide level; chromosomal translocations, leading to 
structural rearrangements in genes; and numeric 
changes in either partial segments of chromosomes or 
whole chromosomes (aneuploidy) causing imbalance in 
gene dosage. 

For the purpose of this review, both segmental and whole 
chromosome imbalances leading to altered DNA dosage 
in cancer cells are included as examples of aneuploidy. 

Incidence of aneuploidy in cancer 

Evidence of aneuploidy involving one or more chromosomes
has been commonly reported in human tumors.
Although these observations were initially made using 
classic cytogenetic techniques late in a tumor's evolu- 
tion and were difficult to correlate with cancer progres- 
sion, more recent studies have reported association of 
specific nonrandom chromosome aneuploidy with 
different biologic properties such as loss of hormone 
dependence and metastatic potential [5]. 

Classic cytogenetic studies performed on tumor cells 
had serious limitations in scope because they were 
applicable only to those cases in which mitotic chromo- 
somes could be obtained. Because of low spontaneous 
rates of cell division in primary tumors, analyses 
depended on cells either derived selectively from 
advanced metastases or those grown in vitro for variable
periods of time. In both instances, metaphases analyzed 
represented only a subset of primary tumor cell popula- 
tion. Two major advances in cytogenetic analytic tech- 
niques, in situ hybridization (ISH) and comparative 
genomic hybridization (CGH), have allowed better reso- 
lution of chromosomal aberrations in freshly isolated 
tumor cells [6]. ISH analyses with chromosome-specific
DNA probes, a powerful adjunct to metaphasic analysis,
allow assessment of chromosomal anomalies within
tumor cell populations in the contexts of whole nuclear
architecture and tissue organization. CGH allows 
genome wide screening of chromosomal anomalies 
without the use of specific probes even in the absence 
of prior knowledge of chromosomes involved. Although 
both techniques have certain limitations in terms of 
their resolution power, they nonetheless provide a 
better approximation of chromosomal changes occurring 
among tumors of various histology, grade, and stage
compared with what was possible with the classic cytogenetic
techniques. Genomic ploidy measurements
have also been performed at the DNA level with flow 
cytometry and cytofluorometric methods. Although 
these assays underestimate chromosome ploidy due to a 
chromosomal gain occasionally masking a chromosomal 
loss in the same cell, several studies using these 
methods have supported the conclusion that DNA 
aneuploidy closely associates with poor prognosis in 
various cancers [7,8]. This discussion of some recent 
examples published on aneuploidy in cancer includes 
discussion of studies dealing with DNA ploidy measurements
as well. Most of these observations are correlative
without direct proof of specific involvement of genes on 
the respective chromosomes. Identification of putative 
oncogenes and tumor suppressor genes on gained and 
lost chromosomes in aneuploid tumors, however, are 
providing strong evidence that chromosomes involved in 
aneuploidy play a critical role in the tumorigenic 
process. 

In renal tumors, either segmental or whole chromosome 
aneuploidy appears to be uniquely associated with 
specific histologic subtypes [9]. Tumors from patients 
with hereditary papillary renal carcinomas (HPRC) 
commonly show trisomy of chromosome 7, when 
analyzed by CGH. Germline mutations of a putative
oncogene MET have been detected in patients with 
HPRC. A recent study [10] has demonstrated that an 
extra copy of chromosome 7 results in nonrandom dupli- 
cation of the mutant MET allele in HPRC, thereby 
implicating this trisomy in tumorigenesis. The study 
suggested that mutation of MET may render the cells 
more susceptible to errors in chromosome replication, 
and that clonal expansion of cells harboring duplicated 
chromosome 7 reflects their proliferative advantage. In 
addition to chromosome 7, trisomy of chromosome 17 in 
papillary tumors and also of chromosome 8 in mesoblas- 
tic nephroma are commonly seen. Association of specific 
chromosome imbalances with benign and malignant 
forms of papillary renal tumors, therefore, not only 
contribute to an understanding of tumor origins and 
evolution, but also implicate aneuploidy of the respec- 
tive chromosomes in the tumorigenic transformation 
process. 

In colorectal tumors, chromosome aneuploidy is a 
common occurrence. In fact, molecular allelotyping 
studies have suggested that limited karyotyping data 
available from these tumors actually underestimate the 
true extent of these changes. Losses of heterozygosity 
reflecting loss of the maternal or paternal allele in 
tumors are widespread and often accompanied by a gain 
of the opposite allele. Therefore, for example, a tumor 
could lose a maternal chromosome while duplicating 
the same paternal chromosome, leaving the tumor cell
with a normal karyotype and ploidy but an aberrant
allelotype. It has been estimated that cancer of the
colon, breast, pancreas, or prostate may lose an average
of 25% of its alleles. It is not unusual to discover that a
tumor has lost over half of its alleles [4]. In clinical
settings, DNA ploidy measurements have revealed that
DNA aneuploidy indicates high risk of developing
severe premalignant changes in patients with ulcerative
colitis, who are known to have an increased risk of
developing colorectal cancer [11]. DNA aneuploidy has
been found to be one of the useful indicators of lymph 
node metastasis in patients with gastric carcinoma and 
associated with poor outcome compared with diploid 
cases [12,13]. CGH analyses of chromosome aneu- 
ploidy, on the other hand, was reported to correlate gain 
of chromosome 20q with high tumor S phase fractions 
and loss of 4q with low tumor apoptotic indices [14]. 
Aneuploidy of chromosome 4 in metastatic colorectal
cancer has recently been confirmed in studies that used
unbiased DNA fingerprinting with arbitrarily primed
polymerase chain reactions to detect moderate gains
and losses of specific chromosomal DNA sequences
[15]. The molecular karyotype (amplotype) generated
from colorectal cancer revealed that moderate gains of
sequences from chromosomes 8 and 13 occurred in
most tumors, suggesting that overrepresentation of
these chromosomal regions is a critical step for metasta- 
tic colorectal cancer. 

In addition to being implicated in tumorigenesis and 
correlated with distinct tumor phenotypes, chromosome 
aneuploidy has been used as a marker of risk assessment
and prognosis in several other cancers. The potential 
value of aneuploidy as a noninvasive tool to identify 
individuals at high risk of developing head and neck 
cancer appears especially promising. Interphase fluores- 
cence in situ hybridization (FISH) revealed extensive 
aneuploidy in tumors from patients with head and neck 
squamous cell carcinomas (HNSCC) and also in clini- 
cally normal distant oral regions from the same individu- 
als [16,17]. It has been proposed that a panel of chromo- 
some probes in FISH analyses may serve as an 
important tool to detect subclinical tumorigenesis and 
for diagnosis of residual disease. The presence of aneu- 
ploid or tetraploid populations is seen in 90% to 95% of 
esophageal adenocarcinomas, and when seen in 
conjunction with Barrett's esophagus, a premalignant 
condition, predicts progression of disease [18,19]. 
Chromosome ploidy analyses in conjunction with loss of 
heterozygosity and gene mutation studies in Barrett's 
esophagus reflect evolution of neoplastic cell lineages in 
vivo [20]. Evolution of neoplastic progeny from Barrett's 
esophagus following somatic genetic mutations 
frequently involves bifurcations and loss of heterozygosity
at several chromosomal loci leading to aneuploidy
and cancer. Accordingly, it is hypothesized that during
tumor cell evolution diploid cell progenitors with
somatic genetic abnormalities undergo expansion with 
acquired genetic instability. Such instability, often 
manifested in the form of increased incidence of aneu- 
ploidy, enters a phase of clonal evolution beginning in 
premalignant cells that proceeds over a period of time 
and occasionally leads to malignant transformation. The 
clonal evolution continues even after the emergence of 
cancer. 

The significance of DNA and chromosome aneuploidy
in other human cancers continues to be evaluated.
Among papillary thyroid carcinomas, aneuploid
DNA content in tumor cells was reported to correlate 
with distant metastases, reflecting worsened progno- 
sis [21]. Genome wide screening of follicular thyroid 
tumors by CGH, on the other hand, revealed frequent 
loss of chromosome 22 in widely invasive follicular 
carcinomas [22]. Chromosome copy number gains in 
invasive neoplasm compared with foci of ductal carci- 
noma in situ (DCIS) with similar histology have been 
proposed to indicate involvement of aneuploidy in 
progression of human breast cancer [23]. ISH analyses 
of cervical intraepithelial neoplasia has provided 
suggestive evidence that chromosomes 1, 7 and X 
aneusomy is associated with progression toward cervi- 
cal carcinoma [24]. 

Although the prognostic value of numeric aberrations 
remains a matter of debate in human hematopoietic 
neoplasia, there have been recent studies to suggest that 
the presence of monosomy 7 defines a distinct subgroup 
of acute myeloid leukemia patients [25]. It is interesting
in this context that therapy-related myelodysplastic 
syndromes have been reported to display monosomy 5 
and 7 karyotypes, reflecting poor prognosis [26]. 

The clinical observations, mentioned previously, are 
supported by in vitro studies in human and rodent cells in 
which aneuploidy is induced at early stages of transformation
[27,28]. It is even suggested that aneuploidy may, in some
instances, cause cell immortalization, which is a
critical step preceding transformation.

Finally, in an interesting study to develop transgenic 
mouse models of human chromosomal diseases, chromosome
segment-specific duplications and deletions of the
genome were reported to be constructed in mouse 
embryonic stem cells [29]. Three duplications for a 
portion of mouse chromosome 11 syntenic with human 
chromosome 17 were established in the mouse 
germline. Mice with a 1 Mb duplication developed corneal
hyperplasia and thymic tumors. The findings represent 
the first transgenic mouse model of aneuploidy of a
defined chromosome segment that documents the direct 
role of chromosome aneusomy in tumorigenesis. 






Aneuploidy as "dynamic cancer-causing 
mutation" instead of a "consequential state" 
in cancer 

According to the hypothesis previously discussed, aneuploidy
represents either a "gain of function" or "loss of
function" mutation at the chromosome level with a 
causative influence on the tumorigenesis process. The 
hypothesis, however, is based only on circumstantial 
evidence even though existence of aneuploidy is corre- 
lated with different tumor phenotypes. The existence of 
numeric chromosomal alterations in a tumor does not 
mean that the change arose as a dynamic mutation due 
to genomic instability, because several factors could lead 
to consequential aneuploidy in tumors, also. Although 
aneuploidy as a dynamic mutation due to genomic insta- 
bility in tumor cells would occur at a certain measurable 
rate per cell generation, a consequential state of aneu- 
ploidy in tumors may not occur at a predictable rate 
under similar conditions or in tumors with similar 
phenotypes. In addition to genomic instability, differ- 
ences in environmental factors with selective pressure, 
could explain high incidence of aneuploidy and other 
somatic mutations in tumors compared with normal cells 
[4]. These include humoral, cell substratum, and cell-
cell interaction differences between tumor and normal
cell environments. It could be argued that despite 
similar rates of spontaneous aneuploidy induction in 
normal and tumor cells, the latter are selected to prolif- 
erate due to altered selective pressure in the tumor cell 
environment, whereas the normal cells are eliminated 
through activation of apoptosis. Alternatively, of course, 
one could postulate that selective expression or overexpression
of anti-apoptotic proteins or inactivation of
proapoptotic proteins in tumor cells may counteract 
default induction of apoptosis in G2/M phase cells 
undergoing missegregation of chromosomes. Recent 
demonstration of overexpression of a G2/M phase anti- 
apoptotic protein survivin in cancer cells [30] suggests 
that this protein may favor aberrant progression of aneu- 
ploid transformed cells through mitosis. This would 
then lead to proliferation of aneuploid cell lineages, 
which may undergo clonal evolution. 

To ascertain that aneuploidy is a dynamic mutational 
event, various human tumor cell lines and transformed 
rodent cell lines have been analyzed for the rate of 
aneuploidy induction. Growing the cells under controlled in
vitro conditions ensures that environmental
factors do not influence selective proliferation of
cells with chromosome instability. In one study,
Lengauer et al. [31•] provided unequivocal evidence by
FISH analyses that losses or gains of multiple chromosomes
occurred in excess of 10^-2 per chromosome per
generation in aneuploid colorectal cancer cell lines. The
study further concluded that such chromosomal instability
appeared to be a dominant trait. Using another in
vitro model system of Chinese hamster embryo (CHE)
cells, Duesberg et al. [32•] have also obtained similar
results. With clonal cultures of CHE cells, transformed 
with nongenotoxic chemicals and a mitotic inhibitor, 
these authors demonstrated that the overwhelming 
majority of the transformed colonies contained more 
than 50% aneuploid cells, indicating that aneuploidy 
would have originated from the same cells that underwent
transformation. All the transformed colonies tested
were tumorigenic. It was further documented that the 
ploidy factor representing the quotient of the modal 
chromosome number divided by the normal diploid 
number, in each clone, correlated directly with the 
degree of chromosomal instability. Therefore, chromo- 
somal instability was found proportional to the degree of 
aneuploidy in the transformed cells and the authors 
hypothesized that aneuploidy is a unique mechanism of 
simultaneously altering and destabilizing, in a massive 
manner, the normal cellular phenotypes. In the absence 
of any evidence that the transforming chemicals used in 
the study did not induce other somatic mutations, it is 
difficult to rule out the contribution of such mutations 
in the transformation process. These results nonetheless 
make a strong case for aneuploidy being a dynamic chro- 
mosome mutation event intimately associated with 
cancer. 
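
The ploidy factor described above is a simple quotient and can be sketched in a few lines. The following Python fragment is only an illustration: the function names and the chromosome counts are hypothetical, and the normal diploid number of 22 for Chinese hamster cells is supplied here as an assumption rather than taken from the study.

```python
from collections import Counter

# Assumed normal diploid chromosome number for Chinese hamster cells.
CHE_DIPLOID_NUMBER = 22


def modal_chromosome_number(counts):
    """Return the most frequent chromosome count among the scored cells."""
    return Counter(counts).most_common(1)[0][0]


def ploidy_factor(counts, diploid_number=CHE_DIPLOID_NUMBER):
    """Modal chromosome number divided by the normal diploid number."""
    return modal_chromosome_number(counts) / diploid_number


# Hypothetical chromosome counts from one transformed clone
# (illustrative values, not data from the study).
example_counts = [33, 32, 33, 34, 33, 22, 33]
example_factor = ploidy_factor(example_counts)
```

Under this definition a clone whose modal number equals the diploid number has a ploidy factor of 1, and the further the factor departs from 1, the greater the degree of aneuploidy being compared against chromosomal instability.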

Aneuploidy versus somatic gene mutation in 
cancer 

The idea that numeric chromosome imbalance or aneu- 
ploidy is a direct cause of cancer was proposed at the 
turn of the century by Theodore Boveri [33]. However, 
the hypothesis was largely ignored over the last several 
decades in favor of the somatic gene mutation hypothe- 
sis, mentioned earlier. Evidence accumulating in the 
literature lately on specific chromosome aneusomies 
recognized in primary tumors, incidence of aneuploidy 
in cells undergoing transformation, and aneuploid tumor 
cells showing a high rate of chromosome instability have 
led to the rejuvenation of Boveri's hypothesis. The 
concept has recently been discussed as a "vintage wine 
in a new bottle" [34•]. The author points out that
except for rare cancers caused by dominant retroviral 
oncogenes, diploidy does not seem to occur in solid
tumors, whereas aneuploidy is a rule rather than excep- 
tion in cancer. 

Aneuploidy as an effective mutagenic mechanism 
driving tumor progression, on the other hand, is being 
recognized as a viable solution to the paradox that with 
known mutation rate in non-germline cells (~10^-7 per
gene per cell generation) tumor cell lineages cannot 
accumulate enough mutant genes during a human life- 
time [35]. The concept is gaining significant credibility 
since genes that potentially affect chromosome segregation
were found mutated in human cancer. Some of
these genes have also been shown to have transforming
capability in in vitro assays. Selected recent publications 
describing the findings are being discussed below in 
reference to the mitotic targets potentially involved in 
inducing chromosome segregation anomalies in cells. 
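
The arithmetic behind the mutation-rate paradox mentioned above can be made concrete with a rough sketch. In the Python fragment below, the per-gene rate of 10^-7 and the chromosomal rate of 10^-2 per generation come from the text; the number of cell generations and the number of genes that must be mutated are hypothetical placeholders, and independence between events is assumed purely for simplicity.

```python
def prob_hit_within(rate, generations):
    """Probability of at least one event in `generations` independent cell
    generations, given a per-generation event probability `rate`."""
    return 1.0 - (1.0 - rate) ** generations


GENE_MUTATION_RATE = 1e-7   # per gene per cell generation (from the text)
CHROMOSOME_RATE = 1e-2      # per chromosome per generation (from the text)
GENERATIONS = 5000          # hypothetical number of cell generations
REQUIRED_GENES = 6          # hypothetical number of genes that must mutate

# Chance that one specific gene is mutated somewhere along the lineage.
p_one_gene = prob_hit_within(GENE_MUTATION_RATE, GENERATIONS)

# Chance that all required genes are mutated in the same lineage,
# treating the genes as independent (a simplifying assumption).
p_all_genes = p_one_gene ** REQUIRED_GENES

# Chance that one specific chromosome is gained or lost at least once.
p_one_chromosome = prob_hit_within(CHROMOSOME_RATE, GENERATIONS)
```

With these placeholder values the probability of accumulating all the required gene mutations in one lineage is vanishingly small, whereas a gain or loss of any given chromosome is close to certain, which is the contrast the argument relies on.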

Potential mitotic targets and molecular 
mechanisms of aneuploidy 

Because aneuploidy represents numeric imbalance in 
chromosomes, it is reasonable to expect that aneuploidy 
arises due to missegregation of chromosomes during cell 
division. There are many potential mitotic targets, 
which could cause unequal segregation of chromosomes 
(Fig. 1). Recent investigations have identified several
genes involved in regulating these mitotic targets and 
mitotic checkpoint functions, which can be implicated 
in induction of aneuploidy in tumor cells. This discus- 
sion is restricted to those mitotic targets and checkpoint 
genes whose abnormal functioning has been observed in 
cancer or has been shown to cause tumorigenic transfor- 
mation of cells, in recent years. The role of telomeres is 
discussed elsewhere in this issue. For a more detailed 
description of the components of mitotic machinery and 
their possible involvement in causing chromosome 
segregation abnormalities in tumor cells, readers may 
refer to a recently published review [36*]. 

Among the mitotic targets implicated in cancer, centrosome
defects have been observed in a wide variety of
malignant human tumors. Centrosomes play a central role 
in organizing the microtubule network in interphase cells 
and mitotic spindle during cell division. Multipolar 
mitotic spindles have been observed in human cancers in 
situ and abnormalities in the form of supernumerary 



Figure 1. Potential mitotic targets causing aneuploidy in
oncogenesis
Diagram illustrates that defects in several processes involving chromosomal,
spindle microtubule, and centrosomal targets, in addition to abnormal cytokinesis,
may cause unequal partitioning of chromosomes during mitosis, leading to
aneuploidy. Recently obtained evidence in favor of some of these possibilities is 
discussed in the text. 



centrosomes, centrosomes of aberrant size and shape as
well as aberrant phosphorylation of centrosome proteins
have been reported in prostate, colon, brain, and breast
tumors [37,38]. In view of the findings that abnormal
centrosomes retain the ability to nucleate microtubules in
vitro, it is conceivable that cells with abnormal centrosomes
may missegregate chromosomes, producing aneuploid
cells. The molecular and genetic bases of abnormal
centrosome generation and the precise pathway through
which they regulate the chromosome segregation process
remain to be elucidated. Recent discovery of a centrosome-associated
kinase STK15/BTAK/aurora2, naturally
amplified and overexpressed in human cancers, has raised
the interesting possibility that aberrant expression of this
kinase is critically involved in abnormal centrosome function
and unequal chromosome segregation in tumor cells
[39,40]. Exogenous expression of the kinase in rodent and
human cells was found to correlate with an abnormal
number of centrosomes, unequal partitioning of chromosomes
during division, and tumorigenic transformation of
cells. It is relevant in this context to mention that the
Xenopus homologue of human STK15/BTAK/aurora2
kinase has recently been shown to phosphorylate a microtubule
motor protein XlEg5, the human orthologue of
which is known to participate in centrosome separation
during mitosis [41]. Findings on STK15/aurora2
kinase, thus, provide an interesting lead to a possible
molecular mechanism of the centrosome's role in oncogenesis.
Centrosomes have, of late, been implicated in oncogenesis
from studies revealing supernumerary centrosomes
in p53-deficient fibroblasts and overexpression of
another centrosome kinase PLK1 being detected in
human non-small cell lung cancer [42].

One of the critical events that ensures equal partition- 
ing of the chromosomes during mitosis is the proper 
and timely separation of sister chromatids that are 
attached to each other and to the mitotic spindle. 
Untimely separation of sister chromatids has been 
suspected as a cause of aneuploidy in human tumors. 
Cohesion between sister chromatids is established 
during replication of chromosomes and is retained until 
the next metaphase/anaphase transition. It has been
shown that during metaphase-anaphase transition, the
anaphase promoting complex/cyclosome triggers the 
degradation of a group of proteins called securins that 
inhibit sister chromatid separation. A vertebrate securin 
(v-securin) has recently been identified that inhibits 
sister chromatid separation and is involved in transfor- 
mation and tumorigenesis. Subsequent analysis 
revealed that the human securin is identical to the 
product of the gene called pituitary tumor transforming 
gene, which is overexpressed in some tumors and
exhibits transforming activity in NIH3T3 cells. It is
proposed that elevated expression of the v-securin may 
contribute to generation of malignant tumors due to
chromosome gain or loss produced by errors in chromatid
separation [43•].

Normal progression through mitosis during prophase to 
anaphase transition is monitored at least at two check- 
points. One checkpoint operates during early prophase 
at G2 to metaphase progression while the second
ensures proper segregation of chromosomes during
metaphase to anaphase transition. Several mitotic
checkpoint genes responding to mitotic spindle defects
have been identified in yeast. The metaphase-anaphase
transition is delayed following activation of this checkpoint
during which kinetochores remain unattached to
the spindle. The signal is transmitted through a kinetochore
protein complex consisting of Mps1p and several
Mad and Bub proteins [44]. It is expected that for
unequal chromosome segregation to be perpetuated 
through cell proliferation cycles giving rise to aneu- 
ploidy, checkpoint controls have to be abrogated. 

Following this logic, Vogelstein et al. [45•] hypothesized
that aneuploid tumors would reveal mutation in mitotic
spindle checkpoint genes. Subsequent studies by these
investigators have proven the validity of this hypothesis
and a small fraction of human colorectal cancers have
revealed the presence of mutations in either hBub1 or
hBubR1 checkpoint genes. It was further revealed that
mutant BUB1 could function in a dominant negative
manner conferring an abnormal spindle checkpoint
when expressed exogenously. Inactivation of spindle
checkpoint function in virally induced leukemia has also
recently been documented following the finding that
hMAD1 checkpoint protein is targeted by the Tax
protein of the human T-cell leukemia virus type 1.
Abrogation of hMAD1 function leads to multinucleation
and aneuploidy [46]. 

In addition to mitotic spindle checkpoint defects, failed 
DNA damage checkpoint function in yeast is frequently 
associated with aberrant chromosome segregation as 
well. It, therefore, appears intriguing yet relevant that 
the human BRCA1 gene, proposed to be involved in
DNA damage checkpoint function, when mutated by a
targeted deletion of exon 11 led to defective G2/M cell
cycle checkpoint function and genetic instability in
mouse embryonic fibroblasts [47]. The cells revealed
multiple functional centrosomes and unequal chromosome
segregation and aneuploidy. Although the molecular
basis for these abnormalities is not known at this
time, it raises the interesting possibility that such an
aneuploidy-driven mechanism may be involved in
tumorigenesis in individuals carrying germline mutations
of the BRCA1 gene.



Conclusion 

Growing evidence from human tumor cytogenetic investigations
strongly suggests that aneuploidy is associated
with the development of tumor phenotypes. Clinical
findings of correlation between aneuploidy and tumorigenesis
are supported by studies with in vitro grown
transformed cell lines. Molecular genetic analyses of 
tumor cells provide credible evidence that mutations in 
genes controlling chromosome segregation during 
mitosis play a critical role in causing chromosome insta- 
bility leading to aneuploidy in cancer. Further elucida- 
tion of molecular and physiologic bases of chromosome 
instability and aneuploidy induction could lead to the 
development of new therapeutic approaches for 
common forms of cancer.

The DEAD box gene, DDX1, is a putative RNA helicase
that is co-amplified with MYCN in a subset of retinoblas- 
toma (RB) and neuroblastoma (NB) tumors and cell 
lines. Although gene amplification usually involves hundreds
to thousands of kilobase pairs of DNA, a number of
studies suggest that co-amplified genes are only overex- 
pressed if they provide a selective advantage to the cells 
in which they are amplified. Here, we further characterize
DDX1 by identifying its putative transcription and
translation initiation sites. We analyze DDX1 protein
levels in MYCN/DDX1-amplified NB and RB cell lines
using polyclonal antibodies specific to DDX1 and show
that there is a good correlation with DDX1 gene copy
number, DDX1 transcript levels, and DDX1 protein levels
in all cell lines studied. DDX1 protein is found in both
the nucleus and cytoplasm of DDX1-amplified lines but
is localized primarily to the nucleus of nonamplified
cells. Our results indicate that DDX1 may be involved in
either the formation or progression of a subset of NB
and RB tumors and suggest that DDX1 normally plays a
role in the metabolism of RNAs located in the nucleus of 
the cell. 



DEAD box proteins are a family of putative RNA helicases 
that are characterized by eight conserved amino acid motifs, 
one of which is the ATP hydrolysis motif containing the core 
amino acid sequence DEAD (Asp-Glu-Ala-Asp) (1-3). Over 40 
members of the DEAD box family have been isolated from a 
variety of organisms including bacteria, yeast, insects, amphib- 
ians, mammals, and plants. The prototypic DEAD box protein 
is the translation initiation factor, eukaryotic initiation factor 
4A, which, when combined with eukaryotic initiation factor 4B, 
unwinds double-stranded RNA (4). Other DEAD box proteins, 
such as p68, Vasa, and An3, can effectively and independently 
destabilize/unwind short RNA duplexes in vitro (5-7). Although
some DEAD box proteins play general roles in cellular
processes such as translation initiation (eukaryotic initiation 
factor 4A (4)), RNA splicing (PRP5, PRP28, and SPP81 in yeast 
(8-10)), and ribosomal assembly (SrmB in Escherichia coli 
(11)), the function of. most DEAD box proteins remains un- 
known. Many of the DEAD box proteins found in higher eu- 
karyotes are tissue- or stage-specific. For example, PL 10 
mRNA is expressed only in the male germ line, and its product 
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has been proposed to have a specific role in translational reg- 
ulation during spermatogenesis (12). Vasa and .ME31B are 
maternal proteins that may be involved in embryogenesis (13, 
14), p68, found in dividing cells (15), is believed to be required 
for the formation of nucleoli and may also have a function in 
the regulation of cell growth and division (16, 17). Other DEAD 
box proteins are implicated in RNA degradation, mRNA stabil- 
ity, and RNA editing (18-20). 

The human DEAD box protein gene DDX1¹ was identified by
differential screening of a cDNA library enriched in transcripts
present in the two RB cell lines Y79 and RB522A (21). The
longest DDX1 cDNA insert isolated from this library was 2.4 kb
with an open reading frame from position 1 to 2201. All eight
conserved motifs characteristic of DEAD box proteins are found
in the predicted amino acid sequence of DDX1 as well as a
region with homology to the heterogeneous nuclear ribonucleoprotein
U, a protein believed to participate in the processing of
heterogeneous nuclear RNA to mRNA (22, 23). The region of
homology to heterogeneous nuclear ribonucleoprotein U spans
128 amino acids and is located between the first two conserved
DEAD box protein motifs, Ia and Ib.

The proto-oncogene MYCN encodes a member of the MYC 
family of transcription factors that bind to an E box element 
(CACGTG) when dimerized with the MAX protein (24, 25). The 
MYCN gene is amplified and overexpressed in approximately 
one-third of all NB tumors (26, 27). Amplification of MYCN is 
associated with rapid tumor progression and a poor clinical 
prognosis (26, 27). MYCN overexpression is usually achieved 
by increasing gene copy number rather than by up-regulating 
basal expression of MYCN (27, 28). Because gene amplification 
involves hundreds to thousands of kilobase pairs of contiguous 
DNA (29-32), it is possible that co-amplification of a gene 
located in proximity to MYCN may contribute to the poor 
clinical prognosis of MYCN-amplified tumors. The DDX1 gene
maps to the same chromosomal band as MYCN, 2p24, and is
located ~400 kb telomeric to the MYCN gene (33-36). All four
MYCN-amplified RB tumor cell lines tested to date are amplified
for DDX1 (21),² while approximately two-thirds of NB cell
lines and 38-68% of NB tumors are co-amplified for both genes
(37-39). George et al. (39) found a significant decrease in the
mean disease-free survival of patients with DDX1/MYCN-amplified
NB tumors compared with MYCN-amplified tumors.
Similarly, Squire et al. (38) observed a trend toward a worse
clinical prognosis when both genes were amplified in the tumors
of NB patients. To date, there have been no reports of a



¹ The abbreviations used are: DDX1, DEAD box 1; NB, neuroblastoma;
RB, retinoblastoma; RACE, rapid amplification of cDNA ends;
PAGE, polyacrylamide gel electrophoresis; nt, nucleotide(s); MOPS,
4-morpholinepropanesulfonic acid; bp, base pair(s); kb, kilobase(s) or
kilobase pair(s).

² R. Godbout, unpublished results.
tumor amplified only for DDX1, and the role that this gene
plays in cancer formation and progression is not known.

Because of the high rate of rearrangements in amplified
DNA (31, 40), it is unlikely that a gene located ~400 kb from
the MYCN gene will be consistently amplified as an intact unit
unless its product provides a growth advantage to the cell.
Based on Southern blot analysis, the DDX1 gene extends over
more than 30 kb, and there are no gross rearrangements of this
gene in DDX1-amplified tumors (21, 38). Furthermore, there is
a good correlation between DDX1 transcript levels and gene
copy number in the tumors analyzed to date. However, we need
to show that DDX1 protein is overexpressed in DDX1-amplified
tumors if we are to entertain the possibility that this protein
plays a role in the tumorigenic process. Here, we isolate and
characterize the 5'-end of DDX1 mRNA and extend the DDX1
cDNA sequence by ~300 nt. We identify the predicted initiation
codon of DDX1 and generate antisera that specifically
recognize DDX1 protein. We analyze levels of DDX1 protein in
both DDX1-amplified and nonamplified RB and NB tumors and
study the subcellular location of this protein in the cell.

MATERIALS AND METHODS 

Library Screening — A human fetal brain cDNA library (Stratagene) 
was screened using a 320-bp DNA fragment from the 5'-end of the
2.4-kb DDX1 cDNA previously described (23). Phagemids containing
positive inserts were excised from λ ZAP II following the supplier's
directions. The ends of the cDNA inserts were sequenced using the
dideoxynucleotide chain termination method with T7 DNA polymerase
(Amersham Pharmacia Biotech).

A human placenta genomic library (CLONTECH) was screened with
the 5'-end of DDX1 cDNA. Positive plaques were purified, and the
genomic DNA was analyzed using restriction enzymes and Southern 
blotting. EcoRI-digested DNA fragments from these clones were sub- 
cloned into pBluescript and digested with exonuclease III and mung 
bean nuclease to obtain sequentially deleted clones. The exon/intron 
map of the 5' portion of the DDX1 gene was obtained by comparing the 
sequence of DDX1 cDNA with that of the genomic DNA. 

Rapid Amplification of cDNA Ends (RACE) — We used the AmpliFINDER
RACE kit (CLONTECH) to extend the 5'-end of DDX1 cDNA.
Briefly, two μg of poly(A)+ RNA isolated from RB522A was reverse
transcribed at 52 °C using either primer P1 or P3 (Fig. 1A). The RNA
template was hydrolyzed, and excess primer was removed. A single-stranded
AmpliFINDER anchor containing an EcoRI site was ligated to
the 3'-end of the cDNA using T4 RNA ligase. The cDNA was amplified
using either primer P2 or P4 (Fig. 1A) and AmpliFINDER anchor
primer. RACE products were cloned into pBluescript.

Primer Extension — Poly(A)+ RNAs were isolated from RB and NB
cell lines as described previously (21, 38). The 21-nt primers
5'-TTCGTTCTGGGCACCATGTGT-3' (primer P4 in Fig. 1A) and
5'-TGGGACCTAQGGCTTCTGGAC-3' (primer P3 in Fig. 1A) were end-labeled with
[γ-32P]ATP (3000 Ci/mmol; Mandel Scientific) and T4 polynucleotide
kinase. Each of the labeled primers was annealed to 2 μg of poly(A)+
RNA at 45 °C for 90 min, and the cDNA was extended at 42 °C for 60
min using avian myeloblastosis virus reverse transcriptase (Promega).
The primer extension products were heat-denatured and run on an 8%
polyacrylamide gel containing 7 M urea in 1X TBE buffer. A G + A
sequencing ladder served as the size standard.
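
Primer extension requires an oligonucleotide that is complementary to the mRNA, so the sense-strand sequence that a primer anneals to is its reverse complement. The following Python sketch illustrates that conversion for primer P4, assuming the sequence above is written as the oligonucleotide actually used; the function and variable names are illustrative and are not part of the original methods.

```python
# Watson-Crick pairing for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


# Primer P4 as given in the text (21 nt).
P4 = "TTCGTTCTGGGCACCATGTGT"

# Sense-strand sequence to which P4 would anneal (assumption: P4 is antisense).
P4_TARGET = reverse_complement(P4)
print(P4_TARGET)
```

Applying the function twice returns the original primer, which is a quick check that no base was dropped during transcription of the sequence.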

S1 Nuclease Protection Assay — The S1 nuclease protection assay to
map the transcription initiation site of DDX1 was performed as described
by Favaloro et al. (41). The DNA probe was prepared by digesting
genomic DNA spanning the upstream region of DDX1 and exon 1
with AvaI, labeling the ends with [γ-32P]ATP (3000 Ci/mmol) and
polynucleotide kinase, and removing the label from one of the ends by
digesting the DNA with SphI (Fig. 4). The RNA samples were resuspended
in a hybridization mixture containing 80% formamide, 40 mM
PIPES, 400 mM NaCl, 1 mM EDTA, and the heat-denatured SphI-AvaI
probe labeled at the AvaI site. The samples were incubated at 45 °C for
16 h and digested with 3000 units/ml S1 nuclease (Boehringer Mannheim)
for 60 min at 37 °C. The samples were precipitated with ethanol;
resuspended in 80% formaldehyde, TBE buffer, 0.1% bromphenol blue,
xylene cyanol; denatured at 90 °C for 2 min; and electrophoresed in a
7 M urea, 8% polyacrylamide gel in TBE buffer.

Northern and Southern Blot Analysis — Poly(A)+ RNAs were isolated
from RB and NB cell lines as described previously (21, 38). Two μg of
poly(A)+ RNA/lane were electrophoresed in a 6% formaldehyde, 1.5%
agarose gel in MOPS buffer (20 mM MOPS, 5 mM sodium acetate, 1 mM
EDTA, pH 7.0) and transferred to nitrocellulose filter in 3 M sodium
chloride, 0.3 M sodium citrate. The filters were hybridized to the following
DNA probes, 32P-labeled by nick translation: (i) a 1.6-kb EcoRI
insert from DDX1 cDNA clone 1042 (21), (ii) a 260-bp cDNA fragment
spanning the 3'-end of DDX1 exon 1 as well as exons 2 and 3, (iii) a
160-bp fragment derived from the 5'-end of DDX1 exon 1, and (iv)
α-actin cDNA to control for lane to lane variation in RNA levels. Filters
were hybridized and washed under high stringency. Southern blot
analysis was as described previously (21).

Preparation of Anti-DDX1 Antiserum — To prepare antiserum to the
C terminus of the DDX1 protein, we inserted a 1.8-kb EcoRI fragment
from bp 848 to 2668 of DDX1 cDNA (Fig. 1B) into EcoRI-digested
pMAL-c2 expression vector (New England Biolabs). DH5α cells transformed
with this vector were grown to mid-log phase and induced with
0.1 mM isopropyl-1-thio-β-D-galactopyranoside. The cells were harvested
3-4 h postinduction and lysed by sonication. Soluble maltose-binding
protein-DDX1 fusion protein was affinity-purified using amylose resin,
and the maltose-binding protein was cleaved with factor Xa. The DDX1
protein was purified on an SDS-PAGE gel, electroeluted, and concentrated.
Approximately 100 μg of protein was injected into rabbits at
4-6-week intervals. For the initial injection, the protein was dispersed
in complete Freund's adjuvant (Sigma), while subsequent injections
were prepared in Freund's incomplete adjuvant. Blood was collected
from each rabbit 10 days after injection, and the specificity of the
antiserum was tested using cell extracts from RB522A. To prepare
antiserum to the N terminus of DDX1 protein, a DDX1 cDNA fragment
from bp 268 to 851 (Fig. 1B) was inserted into pGEX-4T2 (Amersham
Pharmacia Biotech). The recombinant protein produced from this construct
contains the first 186 amino acids of the predicted DDX1 sequence.
Soluble glutathione S-transferase-DDX1 fusion protein was
purified with glutathione-Sepharose 4B (Amersham Pharmacia Biotech).
The glutathione S-transferase component of the fusion protein
was cleaved with thrombin.

Subcellular Fractionations and Western Blot Analysis — We used two 
different procedures for subcellular fractionations. First, we isolated 
nuclear and S100 (soluble cytoplasmic) fractions from RB522A, IMR-32, 
Y79, RB(E)-2, HeLa, and HL60 using the procedure of Dignam (42). On 
average, we obtained 5-6 times more protein in the cytosolic fractions 
than in the nuclear fractions. Second, 10^8 RB522A cells were lysed and
fractionated into S4 (soluble cytoplasmic components), P2 (heavy
mitochondria, plasma membrane fragments), P3 (mitochondria, lysosomes,
peroxisomes, and Golgi membranes), and P4 fractions (membrane vesicles
from rough and smooth endoplasmic reticulum, Golgi, and plasma
membrane) by differential centrifugation (43). We obtained 8 mg of
protein in the S4 fraction, 1 mg in P2, 0.5 mg in P3, and 2 mg in the P4
fraction. The procedures related to the immunoelectron microscopy
have been previously described (44).

For Western blot analysis, proteins were electrophoresed in poly- 
acrylamide-SDS gels and electroblotted onto nitrocellulose using the 
standard protocol for protein transfer described by Schleicher and 
Schuell. The filters were incubated with a 1:5000 dilution of DDX1
antiserum, a 1:200 dilution of anti-MYCN monoclonal antibody (Boehringer
Mannheim), or a 1:200 dilution of anti-actin (Santa Cruz
Biotechnology, Inc., Santa Cruz, CA). For the colorimetric analysis,
antigen-antibody interactions were visualized using either alkaline
phosphatase-linked goat anti-rabbit IgG (for DDX1) or goat anti-mouse
IgG (for MYCN) at a 1:3000 dilution. For the ECL Western blotting
analysis (Amersham Pharmacia Biotech), we used a 1:100,000 dilution
of peroxidase-linked secondary anti-rabbit IgG antibody (for DDX1) or
secondary anti-goat IgG antibody (Jackson ImmunoResearch
Laboratories).

RESULTS 

Identification of the 5'-End of the DDX1 Transcript — We
have previously reported the sequence of DDX1 cDNA isolated
from an RB cDNA library (21, 23). This 2.4-kb DDX1 cDNA
contains an open reading frame spanning positions 1-2201
with a methionine encoded by the first three nucleotides (Fig.
1A). There is a polyadenylation signal and poly(A) tail in the
3'-untranslated region, indicating that the sequence is complete
at the 3'-end. Manohar et al. (37) have also isolated DDX1
cDNA from the NB cell line LA-N-5. Their cDNA extended the
5'-end of our sequence by 42 bp and included an additional in
frame methionine (double underlined in Fig. 1A). The possibility
of additional in frame methionines located further upstream
could not be excluded, because there were no predicted stop
codons in the upstream region of the cDNA.

Fig. 1. Partial sequence and structure of DDX1 cDNA. A, the
sequence of the 5'-end of DDX1 cDNA. The sequence in boldface type
starting at the asterisk was obtained using the RACE strategy. The
additional 6 bp in italic boldface type at the 5'-end of the cDNA are
predicted based on the known DDX1 genomic sequence and primer
extension analysis. P1, P2, P3, and P4 are primers used in the RACE
experiments (the complementary sequence was used in each case).
Primers P3 and P4 were also used for the primer extension analysis.
Three in frame methionine codons are indicated by the double underline.
An in frame stop codon is indicated by the boldface double underline.
The three major transcription initiation sites identified by primer
extension are indicated by the single arrows, while a minor site is
represented by the broad arrow. The predicted DDX1 transcription
initiation sites obtained by RACE, S1 nuclease, and primer extension
are indicated as well as the 5'-ends of DDX1 cDNA sequences obtained
by screening cDNA libraries. The sequences transcribed from exons 1,
2, and 3 are also shown. B, the structure of the 2711-bp DDX1 cDNA is
shown with an open reading frame from position 295 to 2515.

Northern blot analysis indicated a DDX1 transcript size of
~2800 nt, suggesting that the DDX1 cDNAs isolated to date
were lacking ~300-350 bp of 5' sequence. We have used different
approaches to identify the transcription start site of
DDX1. First, we exhaustively screened a commercial fetal
brain cDNA library with the 5'-end of DDX1 cDNA. Although
numerous clones were analyzed, only one extended the sequence
(by 35 bp) beyond that published by Manohar et al. (37)
(Fig. 1A).

We next used the RACE procedure in an attempt to isolate
additional 5' sequence. The nested primers used to amplify the
5'-end of the DDX1 transcript are labeled as primers P1 and P2
in Fig. 1A and are located downstream of the three in frame
methionines (double underlined in Fig. 1A). Poly(A)+ RNA
from RB522A was reverse transcribed at 52 °C using primer
P1, and the reverse transcribed cDNA was amplified using the
nested primer P2 and the 5'-RACE primer. Using this approach,
we generated a product that was 230 bp longer than
any of the cDNAs obtained by screening libraries (Fig. 1A).
Sequencing of this 230-bp cDNA revealed an in frame stop
codon (boldface double underline in Fig. 1A) located 123 bp
upstream of the predicted translation initiation site. We then
prepared primers P3 and P4, located near the 5'-end of the
RACE cDNA (Fig. 1A) and repeated the RACE procedure to see
if additional 5' sequences could be obtained. The resulting
RACE products did not extend the DDX1 cDNA sequence
further.

Fig. 2. Identification of the 5'-end of the DDX1 transcript by
primer extension. Radioactively labeled primer P4 was annealed to 2
μg of poly(A)+ RNA from RB522A (lane 1), 1 μg of poly(A)+ RNA from
RB522A (lane 2), and 2 μg of poly(A)+ RNA from RB(E)-2 cells (lane 3),
and extended using reverse transcriptase. The products were run on an
8% denaturing polyacrylamide gel with a G + A sequencing ladder as
size marker. The primer extension products are indicated on the left.
The sizes of the products (in nt) are presented as the distance from
primer P4.

The location of the DDX1 transcription initiation site was
verified by primer extension. Poly(A)+ RNA was prepared from
the following two cell lines: DDX1-amplified RB cell line
RB522A and a nonamplified RB cell line RB(E)-2. RB522A has
elevated levels of DDX1 mRNA, while RB(E)-2 has at least
20-fold lower levels of this transcript. Three products of 40, 43,
and 46 nt (with a weak signal at 45 nt) were detected in
RB522A using primer P4 (Figs. 1A and 2). The 40-nt product
corresponded exactly with the 5'-end of the RACE-derived
cDNA while the 43- and 46-nt products extended the predicted
size of the DDX1 transcript by 3 and 6 nt, respectively. None of
these products were observed in RB(E)-2. Bands of identical
sizes to those obtained with RB522A mRNA were also observed
in the DDX1-amplified NB cell line BE(2)-C but not in the
DDX1-amplified NB cell line IMR-32 (data not shown). The
same predicted DDX1 transcription initiation site was identified
with primer P3 except that the bands were of weaker
intensity (data not shown). We have designated the transcription
start site identified by primer extension as +1 (Fig. 1A).

Fig. 3. Genomic map of the 5'-end of DDX1. The exons are represented by the black boxes, and distances are in kilobase pairs. The locations
of EcoRI (E) sites are indicated.

The sequence of the 6 nt extending beyond the RACE cDNA
was obtained by comparison of the cDNA sequence with that of
DDX1 genomic DNA. Bacteriophages containing DDX1
genomic DNA were isolated by screening a human placenta
library with 5' DDX1 cDNA. Eighteen kb of DNA were sequenced
from two bacteriophages with overlapping DDX1
genomic DNA. Thirteen exons were identified within this 18-kb
region (Fig. 3) corresponding to cDNA sequences from position
1 to 1249. The 310-bp exon 1 was by far the longest of the 13
exons sequenced, corresponding to the entire 5'-untranslated
region of DDX1 as well as the first in frame methionine. The
sequences transcribed from exons 1, 2, and 3 are indicated in
Fig. 1A.

Knowledge of the genomic structure of DDX1 allowed us to
use the S1 protection assay, a technique that is independent of
reverse transcriptase, to further define the 5'-end of the DDX1
transcript. Poly(A)+ RNAs from six DDX1-amplified lines (RB
lines: Y79 and RB522A; NB lines: BE(2)-C, IMR-32, LA-N-1,
and LA-N-5) and six nonamplified lines (RB lines: RB(E)-2 and
RB412; NB lines: GOTO, NB-1, NUB-7, and SK-N-MC) were
hybridized to a DNA probe that extended from position -745 in
the 5'-flanking DDX1 DNA to position +164 in exon 1. This
DNA probe was labeled at position +164 as indicated in Fig. 4.
Nonhybridized DNA was digested with S1 nuclease, and the
sizes of the protected fragments were analyzed on a denaturing
polyacrylamide gel. Bands of 150-153 nt were observed in lane
2 (RB522A), lane 5 (BE(2)-C), and lane 8 (LA-N-1) with bands
of much weaker intensity in lane 7 (IMR-32) (Fig. 4). Specific
bands were not detected in either DDX1-amplified Y79 and
LA-N-5 or the nonamplified lines. Although the sizes of the S1
protected bands in RB522A, BE(2)-C, and LA-N-1 were 5 and
11 nt shorter than predicted based on RACE and primer extensions,
respectively, there was general agreement with all
three techniques regarding the location of the DDX1 transcription
initiation site (Fig. 1A). The smaller S1 nuclease protected
products could have arisen as the result of S1 digestion of the
5'-end of the RNA:DNA heteroduplex because of its relatively
high rU:dA content (45).
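
The 5- and 11-nt shortfalls quoted above follow from simple coordinate arithmetic. The Python sketch below is illustrative only. It assumes that the RACE 5'-end lies at +7 (because the 40-nt primer extension product matched the RACE end and the 46-nt product, designated +1, extends 6 nt further) and that the comparison is made against the longest observed S1 band of 153 nt; the variable and function names are hypothetical.

```python
LABEL_POSITION = 164        # probe labeled at +164 in exon 1
PRIMER_EXTENSION_START = 1  # +1, 5'-end of the longest (46-nt) product
# Assumption: the RACE 5'-end coincides with the 40-nt product, i.e. +7.
RACE_START = PRIMER_EXTENSION_START + (46 - 40)
LONGEST_S1_BAND = 153       # upper end of the observed 150-153 nt range


def protected_length(start: int, label: int = LABEL_POSITION) -> int:
    """Length of labeled probe protected by an RNA whose 5'-end is at `start`."""
    return label - start + 1


expected_pe = protected_length(PRIMER_EXTENSION_START)  # 164 nt
expected_race = protected_length(RACE_START)            # 158 nt

shortfall_pe = expected_pe - LONGEST_S1_BAND
shortfall_race = expected_race - LONGEST_S1_BAND
print(shortfall_race, shortfall_pe)  # 5 11
```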

Identification of the same transcription initiation site in
three DDX1-amplified lines suggests that this represents the
bona fide start site of DDX1 transcription. However, it was not
clear why this start site was either very weak or not detected in
three other amplified lines. To determine whether the 5'-end of
exon 1 is transcribed in all DDX1-amplified lines, we carried
out a direct analysis of the 5'-end of the DDX1 transcript by
Northern blotting. Two probes were used for this analysis: the
5' probe contained a 160-bp fragment from bp 1 to 160 (5'-half
of exon 1), and the 3' probe contained a 260-bp fragment from
bp 160 to 420 (3'-half of exon 1 as well as exons 2 and 3) (Fig. 1A).
With the 3' probe, we obtained bands of similar size and intensity
in four DDX1-amplified lines (RB522A, BE(2)-C, IMR-32,
and LA-N-5). Band intensity was somewhat weaker in Y79 and
stronger in LA-N-1 in comparison with the other lines (Fig. 5).
No signal was detected in the non-DDX1-amplified line RB412.
With the 5' probe, a relatively strong signal was observed in
RB522A, BE(2)-C, and LA-N-1, while a considerably weaker
but readily apparent signal was detected in Y79, IMR-32, and
LA-N-5. The signal obtained with actin indicates that, with the
exception of LA-N-1, similar amounts of RNA were loaded in
each lane and that the RNA was not degraded. These results
indicate that at least a portion of the 160-bp 5'-end of exon 1 is
transcribed in all DDX1-amplified lines.

Based on primer extension, S1 nuclease protection assay,
Northern blot analysis, and the sequencing of the RACE products,
we conclude that the DDX1 transcript is 2.7 kb with an
open reading frame spanning nucleotides 295-2515 encoding a
predicted protein of 740 amino acids with an estimated molecular
mass of 82.4 kDa (Fig. 1B). An in frame stop codon is located
123 nt upstream of the predicted translation initiation site, at
positions 172-174. The first in frame methionine following the
stop codon is in agreement with the Kozak consensus sequence
(46). Furthermore, the predicted start methionine codon for
human DDX1 corresponds perfectly with that of Drosophila
DDX1 (47). A stop codon is located 15 nt upstream of the
initiation codon in Drosophila DDX1.
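
The coordinates reported above can be cross-checked with a few lines of arithmetic. The Python sketch below is an illustration, not part of the original analysis. It uses only the numbers given in the text; the inference that position 2515 falls within the termination codon (because 740 sense codons starting at 295 end at 2514) is an assumption, and the variable names are hypothetical.

```python
ORF_START = 295          # first nucleotide of the predicted initiation codon
ORF_END_REPORTED = 2515  # end of the open reading frame as reported
PROTEIN_LENGTH = 740     # predicted number of amino acids
UPSTREAM_STOP = (172, 174)

coding_nt = 3 * PROTEIN_LENGTH               # nucleotides in sense codons
last_coding_nt = ORF_START + coding_nt - 1   # last nucleotide of codon 740

# Distance from the upstream stop codon to the initiation codon.
stop_to_start = ORF_START - UPSTREAM_STOP[0]
# The stop codon is in frame if that distance is a multiple of 3.
in_frame = stop_to_start % 3 == 0

print(coding_nt, last_coding_nt, stop_to_start, in_frame)
```

The 123-nt distance matches the value stated in the text and is a whole number of codons, consistent with the stop codon being in frame with the predicted initiation codon.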

Analysis of DDX1 Protein Levels in Neuroblastoma and Retinoblastoma
— We and others have previously shown that there
is a good correlation between gene copy number and RNA levels
in DDX1-amplified RB and NB cell lines (37, 38). To determine
whether the correlation extends to DDX1 protein levels, we
prepared antiserum to two nonoverlapping recombinant DDX1
proteins. First, we prepared a C terminus recombinant protein
construct by inserting a 1.8-kb EcoRI fragment from bp 848 to
2668 (amino acids 185-740) (Fig. 1B) into the pMAL-c2 expression
vector. Recombinant protein expression was induced with
isopropyl-1-thio-β-D-galactopyranoside, and the 110-kDa maltose-binding
protein-DDX1 fusion product was purified by affinity
chromatography using amylose resin, followed by electrophoresis
on an SDS-PAGE gel after cleaving the maltose-binding
protein fusion partner with factor Xa. Second, we prepared an
N terminus construct by ligating a DNA fragment from bp 268
to 851 (amino acids 1-186) into pGEX-4T2. The 50-kDa glutathione
S-transferase-DDX1 fusion protein was purified by affinity
chromatography on a glutathione column. This N terminus
fusion protein contains only the first of the eight conserved
motifs found in all DEAD box proteins, while the C terminus
fusion protein includes the remaining seven motifs.
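
The amino acid ranges quoted for the two constructs can be recovered from the cDNA coordinates, given the open reading frame start at position 295 (Fig. 1B). The Python sketch below is a hedged illustration: the helper names are hypothetical, and it simply reports the first and last codons touched by each fragment, including codons that are only partially contained in the fragment.

```python
ORF_START = 295       # first nucleotide of the initiation codon (Fig. 1B)
PROTEIN_LENGTH = 740  # predicted number of amino acids


def codon_index(nt_pos: int) -> int:
    """1-based index of the codon that contains cDNA position nt_pos."""
    if nt_pos < ORF_START:
        raise ValueError("position lies in the 5'-untranslated region")
    return (nt_pos - ORF_START) // 3 + 1


def residues_covered(start_nt: int, end_nt: int) -> tuple[int, int]:
    """First and last residues touched by a cDNA fragment, clipped to the ORF."""
    first = codon_index(max(start_nt, ORF_START))
    last = min(codon_index(end_nt), PROTEIN_LENGTH)
    return first, last


n_terminus = residues_covered(268, 851)   # pGEX-4T2 insert
c_terminus = residues_covered(848, 2668)  # pMAL-c2 insert
print(n_terminus, c_terminus)  # (1, 186) (185, 740)
```

The two fragments meet at the codons spanning positions 847-852, which is why the reported residue ranges share residues 185 and 186 even though the recombinant proteins are described as nonoverlapping.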

We measured DDX1 protein levels in total cell extracts of
three RB and 10 NB cell lines. Using antiserum to the N
terminus fusion protein, we observed a strong signal in all
DDX1-amplified cell lines: the RB cell lines Y79 (lane 1) and
RB522A (lane 2) and the NB cell lines BE(2)-C (lane 4), IMR-32
(lane 6), LA-N-1 (lane 8), and LA-N-5 (lane 9) (Fig. 6). Two
bands were observed in the majority of extracts. Of the amplified
lines, Y79 produced the weakest signal, with the most
intense signal observed in LA-N-1. There was an excellent
correlation between DDX1 protein and mRNA levels in these cell
lines, with lower levels of DDX1 mRNA observed in Y79 and
higher levels in LA-N-1 (Fig. 7A). As shown in Fig. 7B, this
correlation extended to DDX1 gene copy number. No gross
DNA rearrangements were seen in the DDX1-amplified lines;
however, three small bands of altered size were observed in the
RB412 lane. Although the nature of the DNA alteration is not
known, it is noteworthy that DDX1 transcript levels in RB412
are extremely low (Fig. 7A) and that the top DDX1 protein band
in RB412 cell extracts is smaller in size than the top band from
the other cell extracts (Fig. 6).

Fig. 4. S1 nuclease mapping of the 5'-end of the DDX1 transcript.
Two μg of poly(A)+ RNA from four RB lines
(DDX1-amplified Y79 and RB522A and
nonamplified RB(E)-2 and RB412), eight
NB lines (DDX1-amplified BE(2)-C, IMR-32,
LA-N-1, and LA-N-5 and nonamplified
GOTO, NB-1, NUB-7, and SK-N-MC),
and tRNA as a negative control were hybridized
to a SphI-AvaI fragment labeled
at the AvaI site with [γ-32P]ATP and
polynucleotide kinase. Bands of 150-153
nt are shown in lanes 2 (RB522A), 5
(BE(2)-C), and 8 (LA-N-1) with much
weaker bands in lane 7 (IMR-32). A map
of the probe indicating the transcription
initiation site identified by primer extension
(+1), the labeling site (*), and exons
1 and 2, is shown at the bottom.

Fig. 5. Northern blot analysis of the 5'-end of the DDX1 transcript.
Two μg of poly(A)+ RNA isolated from DDX1-amplified Y79,
RB522A, BE(2)-C, IMR-32, LA-N-1, and LA-N-5 and nonamplified
RB412 were electrophoresed in a 1.5% agarose-formaldehyde gel. The
RNA was transferred to a nitrocellulose filter and sequentially hybridized
with a 260-bp fragment from DDX1 cDNA from bp +160 to +420
(3'-end of exon 1 as well as exons 2 and 3) (A), a 160-bp fragment from
DDX1 cDNA from bp +1 to +160 (5'-end of exon 1) (B), and actin cDNA
(C). The DNA was labeled with [32P]dCTP by nick translation. The blots
were hybridized and washed under high stringency.

Fig. 6. DDX1 protein expression in RB and NB cell lines. Western
blots were prepared using total cellular extracts from three RB
(Y79, RB522A, and RB412) and 10 NB cell lines (BE(2)-C, GOTO,
IMR-32, KAN, LA-N-1, LA-N-5, NB-1, NUB-7, SK-N-MC, and SK-N-SH).
The lines that are amplified for the DDX1 gene are Y79, RB522A,
BE(2)-C, IMR-32, LA-N-1, and LA-N-5. Twenty μg of protein were
loaded in each lane and electrophoresed in a 10% SDS-PAGE gel. DDX1
was detected using a 1:5000 dilution of the antiserum to the amino
terminus of DDX1 protein. Size markers in kilodaltons are indicated on
the side.

Fig. 7. Northern and Southern blot analyses of DDX1 in RB
and NB cell lines. A, 2 μg of poly(A)+ RNA were loaded in each lane,
electrophoresed in a 1.5% agarose-formaldehyde gel, and transferred to
a nitrocellulose filter. The filter was first hybridized to a 32P-labeled
1.6-kb DDX1 cDNA (clone 1042) (21), stripped, and rehybridized to
actin DNA. B, 10 μg of genomic DNA from each of the indicated cell
lines were digested with EcoRI, electrophoresed in a 1% agarose gel,
and transferred to a nitrocellulose filter. The filter was hybridized to
32P-labeled clone 1042 DDX1 cDNA, stripped, and reprobed with labeled
α-fetoprotein cDNA. Markers (in kilobase pairs) are indicated on
the side.

Fig. 8. Distribution of DDX1 in the nucleus and cytoplasm. A,
cytosolic and nuclear extracts were prepared from RB522A and electrophoresed
in a 7% SDS-PAGE gel. Cytosolic extracts were loaded in
lanes 1 (20 μg of protein) and 2 (10 μg), while nuclear extracts were
loaded in lanes 3 (10 μg) and 4 (20 μg). DDX1 was visualized using a
1:5000 dilution of the antiserum to the N terminus. The BenchMark
protein ladder size markers (kilodaltons) are indicated on the left. B,
cytosolic and nuclear extracts were prepared from HL60, Y79, IMR-32,
HeLa, RB522A, and RB(E)-2 and electrophoresed in an 8% SDS-PAGE
gel. Twenty μg of proteins were loaded in each lane marked C (cytosolic)
and N (nuclear). DDX1 was visualized using a 1:5000 dilution of the
antiserum to the N terminus. Actin levels were analyzed using a 1:200
dilution of anti-actin antibody (Santa Cruz Biotechnology).

Two DDX1 protein bands were present in most of the lanes in
Fig. 6. The same two bands were detected with antiserum to
the C terminus of the DDX1 protein, as well as a third band at
~60 kDa (data not shown). There was no variation in the
intensity of the 60-kDa band in DDX1-amplified and nonamplified
cell extracts. The 60-kDa band probably represents another
member of the DEAD box protein family, because the C
terminus DDX1 protein used to prepare this antiserum contained
seven of the eight conserved motifs found in all DEAD
box proteins. To obtain an estimate of the size of the two DDX1
bands, we ran cellular extracts from RB522A on a 7% SDS-PAGE
gel with the BenchMark protein ladder (Life Technologies,
Inc.). The size of the DDX1 protein was determined using
the Alpha Imager 2000 documentation and analysis system for
molecular weight calculation. Based on this analysis, the estimated
molecular mass of the top band is 89.5 kDa, while that
of the bottom band is 83.5 kDa. The 84-kDa band may represent
the unmodified product encoded by the DDX1 transcript
(capable of encoding a protein with a predicted molecular mass
of 82.4 kDa), while the top band may represent post-translational
modification of DDX1 protein (e.g. phosphorylation). Another
possibility is that the top band represents intact DDX1
and the lower band is a specific truncated or degradation product
of DDX1. Yet a third possibility is that the two bands
represent the products of differentially spliced transcripts or
different translation initiation sites. However, the lack of any
obvious differences in DDX1 transcript sizes in the three RB
and 10 NB lines analyzed in Fig. 7A does not support the latter
possibility (e.g. compare the DDX1 transcript size in NUB-7
(which produces the lower DDX1 protein band) and in NB-1
(which produces the higher DDX1 protein band)).
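
The algorithm used by the imaging software to calculate molecular mass is not described here. A common approach, sketched below in Python as an assumption rather than as the method actually used, is to fit log10(mass) against migration distance for the ladder bands and interpolate the unknown band. The ladder distances, marker masses, and band position in the example are hypothetical values chosen for illustration and are not data from this study.

```python
import math


def fit_log_linear(distances, masses_kda):
    """Least-squares fit of log10(mass) = intercept + slope * distance."""
    n = len(distances)
    logs = [math.log10(m) for m in masses_kda]
    mean_x = sum(distances) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_mass(distance, intercept, slope):
    """Molecular mass (kDa) predicted for a band at the given migration distance."""
    return 10 ** (intercept + slope * distance)


# Hypothetical ladder: migration distances (mm) and marker masses (kDa).
ladder_mm = [12.0, 18.5, 26.0, 35.0]
ladder_kda = [160.0, 120.0, 90.0, 70.0]

intercept, slope = fit_log_linear(ladder_mm, ladder_kda)
band_mass = estimate_mass(26.5, intercept, slope)  # hypothetical band position
print(round(band_mass, 1))
```

Because the relationship is only approximately log-linear over a limited mass range, bands that differ by a few kilodaltons, such as the two DDX1 bands, are best compared on the same gel and against markers that bracket them closely.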

Subcellular Localization of DDX1 Protein — DEAD box proteins
have been implicated in a variety of cellular functions
including RNA splicing in the nucleus, translation initiation in
the cytoplasm, and ribosome assembly in the nucleolus. To
obtain an indication of the possible role of DDX1, we studied its
subcellular location. Nuclear and cytosolic extracts were prepared
from DDX1-amplified RB522A and run on a 7% SDS-PAGE
gel. Although there was more DDX1 protein in the
cytosol than in the nucleus on a per cell basis, the proportion of
DDX1 protein relative to total protein was similar in both
cellular compartments (Fig. 8A). Both the 90- and 84-kDa
bands were present in cytosol and nuclear extracts, although
the bottom band was more readily apparent in the cytosol. By
running the gel for an extended period of time (twice as long as
usual), we were able to detect an additional weak band at ~88
kDa in both nuclear and cytosolic extracts.

To determine whether DDX1 consistently localizes to both the cytoplasm
and nucleus, we prepared cytosol and nuclear extracts from two
additional DDX1-amplified lines, Y79 and IMR-32, as well as from
nonamplified RB(E)-2, HL60, and HeLa. DDX1 protein was found in both the
nucleus and cytoplasm of IMR-32, primarily in the cytoplasm of Y79, and
mainly in the nucleus of the three nonamplified lines (Fig. 8B). In
addition, DDX1 was almost exclusively found in nuclear extracts prepared
from normal GM38 fibroblasts (data not shown). We used anti-actin
antibody to ensure that our nuclear and cytosolic extracts were not
cross-contaminated (Fig. 8B).

Fig. 9. Subcellular location of DDX1 protein. RB522A cells were
fractionated into nuclear (lane 1), S100 and S4 cytosol (lanes 2 and 3),
P2 membrane (lane 4), P3 membrane (lane 5), and P4 membrane (lane
6) fractions. Twenty μg of protein were loaded in each lane and run on
a 10% SDS-PAGE gel. A, DDX1 protein was detected using a 1:5000
dilution of the antiserum to the N terminus of DDX1. B, MYCN protein
was detected using a commercially available antibody at a 1:200
dilution. Size markers (kilodaltons) are indicated on the side.

We next carried out a more detailed analysis of DDX1 subcellular
location using two different approaches: (i) fractionation of cellular
components into nuclei; S100 or S4 cytosol (containing soluble
cytoplasmic components, including 40 S ribosomes); P2 (heavy
mitochondria, plasma membrane fragments plus material trapped by these
membranes); P3 (mitochondria, lysosomes, peroxisomes, Golgi membranes,
some rough endoplasmic reticulum); and P4 (microsomes from smooth and
rough endoplasmic reticulum, Golgi and plasma membranes) (43); and (ii)
immunogold electron microscopy. The DDX1-amplified RB522A cell line was
used for both experiments. The fractionation procedures indicate that
DDX1 is mainly in the nucleus and in the cytosol (S4 and S100 fractions)
of RB522A cells (Fig. 9A). As a control, we used anti-human MYCN
antibody to determine the location of MYCN (also amplified in RB522A) in
our subcellular fractions. As shown in Fig. 9B, MYCN was primarily found
in the nucleus, as one would expect of a transcription factor.

For the electron microscopy analysis, antiserum to the N 
terminus of DDX1 was coupled to protein A gold particles, and 
the distribution of DDX1 was examined in RB522A cells fixed 
in paraformaldehyde and glutaraldehyde. DDX1 was present 
in both the cytoplasm and nucleus (data not shown). There was 
no association with either cell organelles or with nuclear or 
plasma membranes. 

DISCUSSION 

There are presently few clues as to the function of DDX1 in normal and
cancer cells. Our earlier data indicate that DDX1 mRNA is present at
higher levels in fetal tissues of neural origin (retina and brain)
compared with other fetal tissues (21). There may therefore be a
requirement for elevated levels of this putative RNA helicase for the
efficient production or processing of neural specific transcripts. A
role in cancer formation or progression is an intriguing possibility,
because overexpression of an RNA unwinding protein could affect the
secondary structure of RNAs in such a way as to alter the expression of
specific proteins in tumor cells. DDX1 is co-amplified with MYCN in a
subset of RB and NB cell lines and tumors (37-39). MYCN amplification is
common in stage IV NB tumors and is a well documented indicator of poor
prognosis. A general trend toward a poorer clinical prognosis is
observed when both the MYCN and DDX1 genes are amplified compared with
when only MYCN is amplified (38, 39), suggesting a possible role for
DDX1 in NB tumor formation or progression.

It is generally accepted that co-amplified genes are not overexpressed
unless they provide a selective growth advantage to the cell (48, 49).
For example, although ERBA is closely linked to ERBB2 in breast cancer
and both genes are commonly amplified in these tumors, ERBA is not
overexpressed (48). Similarly, three genes mapping to 12q13-14 (CDK4,
SAS, and MDM2) are overexpressed in a high percentage of malignant
gliomas showing amplification of this chromosomal region, while other
genes mapping to this region (GADD153, GLI, and A2MR) are rarely
overexpressed in gene-amplified malignant gliomas (50, 51). The first
three genes are probably the main targets of the amplification process,
while the latter three genes are probably incidentally included in the
amplicons. The data shown here indicate that DDX1 is overexpressed at
both the protein and RNA levels in DDX1-amplified RB and NB cell lines
and that there is a strong correlation between DDX1 gene copy number,
DDX1 RNA levels, and DDX1 protein levels in these lines. Our results are
therefore consistent with DDX1 overexpression playing a positive role in
some aspect of NB and RB tumor formation or progression. Recently, Weiss
et al. (52) have shown that transgenic mice that overexpress MYCN
develop NB tumors several months after birth. They conclude that MYCN
overexpression can contribute to the initiation of tumorigenesis but
that additional events are required for tumor formation. Amplification
of DDX1 may represent one of many alternative pathways by which a normal
precursor "neuroblast" or "retinoblast" cell gains malignant properties.

The function of the majority of tissue-specific or developmentally
regulated DEAD box genes remains unknown. However, some members of this
protein family have been either directly or indirectly implicated in
tumorigenesis. For example, the p68 gene has been found to be mutated in
the ultraviolet light-induced murine tumor 8101 (53), while DDX6 (also
known as RCK or p54) is encoded by a gene located at the breakpoint of
the translocation involving chromosomes 11 and 14 in a cell line derived
from a B-cell lymphoma (54, 55). Similarly, the production of a chimeric
protein between DDX10 and the nucleoporin gene NUP98 has been proposed
to be involved in the pathogenesis of a subset of myeloid malignancies
with inv(11)(p15q22) (56). Interestingly, Grandori et al. (57) have
shown that MYCC interacts with a DEAD box gene called MrDb, suggesting
that the transcription of some DEAD box genes could be regulated through
interaction with members of the MYC family. Future work will involve
determining whether DDX1 represents another member of the DEAD box
family with a role in the tumorigenic process.

DEAD box proteins have been implicated in translation initiation, RNA
splicing, RNA degradation, and RNA stability (3, 18, 19). We carried out
subcellular localization studies in an attempt to obtain a general
indication of the function of DDX1. We found DDX1 protein in both the
cytoplasm and nucleus of DDX1-amplified NB and RB lines. In contrast,
DDX1 was mainly located in the nucleus of nonamplified cell lines and
normal fibroblast cultures. DDX1 was not associated with cellular
organelles or with membranes based on immunoelectron microscopy. We
therefore propose that the primary role of DDX1 is in the nucleus. The
presence of DDX1 in the cytoplasm of DDX1-amplified cells may indicate
that the amount of DDX1 protein that is allowed in the nucleus is
tightly regulated. Alternatively, DDX1 may play a dual role in the
nucleus and cytoplasm of DDX1-amplified cells.

An important component of our analysis was to identify the translation
and transcription initiation sites of DDX1. We used a combination of
techniques to identify the transcription start site: screening of RB and
fetal brain libraries, RACE, primer extension, genomic DNA sequencing,
S1 nuclease mapping, and Northern blot analysis using probes to the
predicted 5'-end of the transcript. The transcription start site
identified using these techniques is located ~300 nt upstream of the
predicted translation initiation codon and was readily detected in three
DDX1-amplified lines and barely detectable in a fourth amplified line.
The 5'-untranslated region as well as the first in frame methionine are
encoded within the first exon of DDX1. An in frame stop codon is located
123 nt upstream of the predicted initiation codon. We were unable to
identify the transcription initiation site of DDX1 in two of the six
amplified lines tested as well as in nonamplified lines. Although it
remains possible that there are different transcription start sites in
different cell lines, detection of lower levels (rather than the
absence) of the 5'-most 160 nt of the DDX1 transcript in IMR-32, Y79,
and LA-N-5 compared with RB522A, BE(2)-C, and LA-N-1 supports a
quantitative rather than a qualitative difference in the 5'-end of this
transcript in these cells. Our results suggest that the 5'-end of DDX1
mRNA is rarely intact, even in mRNA preparations that otherwise appear
to be of high quality based on analysis of control transcripts. The
5'-end of DDX1 mRNA may therefore be especially susceptible to
degradation, perhaps because of its sequence and/or secondary structure.

In conclusion, we have mapped the 5'-end of the 2.7-kb DDX1 transcript
and have identified the predicted translation initiation site of DDX1
protein. We have found that DDX1-amplified RB and NB tumor lines
overexpress DDX1 protein and that there is a good correlation between
gene copy number and both transcript and protein levels in these cells.
We have shown that DDX1 protein is primarily located in the nucleus of
cells that are not DDX1-amplified. In contrast, DDX1 is present in both
the nucleus and cytoplasm of DDX1-amplified NB and RB lines. A
cytoplasmic location in DDX1-amplified lines may indicate that the
amount of nuclear DDX1 is tightly regulated or that DDX1 plays a dual
role in the cytoplasm and nucleus of these cells.

Abstract

The BMI-1 gene is a putative oncogene belonging to the Polycomb group
family that cooperates with c-myc in the generation of mouse lymphomas
and seems to participate in cell cycle regulation and senescence by
acting as a transcriptional repressor of the INK4a/ARF locus. The BMI-1
gene has been located on chromosome 10p13, a region involved in
chromosomal translocations in infant leukemias, and amplified in
occasional non-Hodgkin's lymphomas (NHLs) and solid tumors. To determine
the possible alterations of this gene in human malignancies, we have
examined 160 lymphoproliferative disorders, 13 myeloid leukemias, and 89
carcinomas by Southern blot analysis and detected BMI-1 gene
amplification (3- to 7-fold) in 4 of 36 (11%) mantle cell lymphomas
(MCLs) with no alterations in the INK4a/ARF locus. BMI-1 and p16INK4a
mRNA and protein expression were also studied by real-time quantitative
reverse transcription-PCR and Western blot, respectively, in a subset of
NHLs. BMI-1 expression was significantly higher in chronic lymphocytic
leukemia and MCL than in follicular lymphoma and large B cell lymphoma.
The four tumors with gene amplification showed significantly higher mRNA
levels than other MCLs and NHLs with the BMI-1 gene in germline
configuration. Five additional MCLs also showed very high mRNA levels
without gene amplification. A good correlation between BMI-1 mRNA levels
and protein expression was observed in all types of lymphomas. No
relationship was detected between BMI-1 and p16INK4a mRNA levels. These
findings suggest that BMI-1 gene alterations in human neoplasms are
uncommon, but they may contribute to the pathogenesis in a subset of
malignant lymphomas, particularly of mantle cell type.

Introduction 

The BMI-1 gene is a putative oncogene of the Polycomb group originally
identified by retroviral insertional mutagenesis in Eμ-c-myc transgenic
mice infected with the Moloney murine leukemia virus (1, 2). These
animals had a rapid development of pre-B cell lymphomas showing frequent
proviral insertions near the BMI-1 gene. This integration resulted in
BMI-1 overexpression suggesting a cooperative effect between C-MYC and
BMI-1 genes in the development of these tumors (3, 4). Recent studies
have indicated that the BMI-1 gene may also participate in cell cycle
control and senescence through the INK4a/ARF locus by acting as an
upstream negative regulator of p16INK4a and p14/p19ARF gene expression
(5). The human BMI-1 gene has been mapped to chromosome 10p13 (6), a
region involved in chromosomal translocations in infant leukemias (7)
and rearrangements in malignant T cell lymphomas (8, 9). More recently,
high-level DNA amplifications of this region have been found by
comparative genomic hybridization in NHLs and solid tumors (10, 11).
However, the possible implication of the BMI-1 gene in these alterations
and its role in the pathogenesis of human tumors is not known. The aim
of this study was to analyze the possible BMI-1 gene alterations and
expression in a large series of human neoplasms and to determine the
relationship with INK4a/ARF locus aberrations.

Materials and Methods 

Case Selection. A series of 262 human tumors, including 173
hematological malignancies and 89 carcinomas (Table 1), matched normal
tissues from all carcinomas, 11 samples of normal peripheral mononuclear
cells, and 5 reactive lymph nodes and tonsils, were selected based on
the availability of frozen samples for molecular analysis.

DNA Extraction and Southern Blot Analysis. Genomic DNA was obtained
using Proteinase K/RNase treatment. Fifteen μg were digested with EcoRI
and HindIII restriction enzymes (Life Technologies, Inc., Gaithersburg,
MD) for Southern blot analysis and hybridized with a 1.5-kb PstI
fragment of the partial BMI-1 cDNA (6).

RNA Extraction and Real-time Quantitative RT-PCR. Total RNA was
obtained from 67 lymphoid neoplasms (10 CLLs, 27 MCLs, 8 FLs, and 22
LCLs) using guanidine/isothiocyanate extraction and cesium/chloride
gradient centrifugation. One μg of total RNA was transcribed into cDNA
using MMLV-reverse transcriptase (Life Technologies, Inc.) and random
hexamers, following manufacturer's directions. Sequences of the BMI-1
and the p16 detection probes and primers were designed using the Primer
Express program (Applied Biosystems, Foster City) as follows: BMI-1
sense, 5'-CTGGTTGCCCATTGACAGC-3'; BMI-1 antisense,
5'-CACAAAATCAATGCCACCCA-3'; p16 sense, 5'-CAACGCACCGAATAGTTACGG-3'; p16
antisense, 5'-AACTTCGTCCTCCAGAGTCGC-3'. The probes BMI-1,
5'-CAGCTCGCTTCAAGATGGCCGC-3', and p16, 5'-CGGAGGCCGATCCAGCTGGGTA-3',
were labeled with 6-carboxy-fluorescein as the reporter dye. The
TaqMan-GAPDH Control Reagents (Applied Biosystems) were used to amplify
and detect the GAPDH gene, as recommended by the manufacturer. The
quantitative assay amplified 1 μl of cDNA in two to four replicates
using the primers and probes described above and the standard master mix
(Applied Biosystems). All reactions were performed in an ABI PRISM 7700
Sequence Detector System (Applied Biosystems). GAPDH, BMI-1, and
p16INK4a expression was related to a standard curve derived from serial
dilutions of Raji cDNA. The RUs of BMI-1 and p16INK4a expression were
defined as the mRNA levels of these genes normalized to the GAPDH
expression level in each case.
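
The quantification described above (a standard curve from serial dilutions, followed by normalization to GAPDH) can be illustrated with a minimal sketch. It assumes a linear relation between threshold cycle (Ct) and log10 of the input amount, which is the usual form of a relative standard curve; the source does not give the fitting procedure. All function names, dilution amounts, and Ct values below are hypothetical and serve only to show the arithmetic.

```python
def fit_standard_curve(log10_amounts, ct_values):
    """Least-squares fit of Ct = slope * log10(amount) + intercept.

    The amounts are the relative inputs of the dilution series
    (hypothetical here; the source uses serial dilutions of Raji cDNA).
    """
    n = len(log10_amounts)
    mean_x = sum(log10_amounts) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_amounts)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_amounts, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_ct(ct, slope, intercept):
    """Invert the standard curve to get the relative amount for a sample Ct."""
    return 10 ** ((ct - intercept) / slope)


def relative_units(target_ct, gapdh_ct, target_curve, gapdh_curve):
    """Target amount divided by GAPDH amount, each read off its own curve."""
    target = amount_from_ct(target_ct, *target_curve)
    gapdh = amount_from_ct(gapdh_ct, *gapdh_curve)
    return target / gapdh


# Illustrative dilution series: undiluted, 1:10, 1:100, 1:1000 (log10 scale).
dilutions = [0, -1, -2, -3]
bmi1_curve = fit_standard_curve(dilutions, [20.0, 23.3, 26.6, 29.9])
gapdh_curve = fit_standard_curve(dilutions, [15.0, 18.3, 21.6, 24.9])

# A sample whose BMI-1 and GAPDH signals both match the 1:10 standard
# has a ratio close to 1.0 relative units.
ru = relative_units(23.3, 18.3, bmi1_curve, gapdh_curve)
print(round(ru, 3))
```

Fitting a separate curve for each gene, as sketched here, keeps differences in amplification efficiency between the target and GAPDH assays from distorting the ratio.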

Protein Analysis. Whole-cell protein extracts were obtained from
additional frozen tissue available in 31 cases (7 CLLs, 12 MCLs, 8 FLs,
and 4 LCLs), loaded onto a 10% SDS-polyacrylamide gel, and
electroblotted to a nitrocellulose membrane (Amersham). Blocked
membranes were incubated sequentially with the monoclonal antibody
BMI-F6 (12), antimouse conjugated to horseradish peroxidase (Amersham),
and detected by enhanced chemiluminescence (Amersham) according to the
manufacturer's recommendations.

Table 1 Hematological malignancies and solid tumor samples analyzed for BMI-1
gene alterations

Tissue samples                                No. of cases

Hematological malignancies
  Hodgkin's disease                                 2
  B cell lymphoproliferative disorders
    B-Acute lymphoblastic leukemia                 14
    CLL                                            29
    Hairy cell leukemia                             4
    FL                                             15
    MCL                                            36
    LCL                                            40
  T cell lymphoproliferative disorders
    T-Acute lymphoblastic leukemia                  8
    Large granular cell leukemia                    4
    Peripheral T-cell lymphoma                      8
  Myeloproliferative disorders
    Acute myeloid leukemia                          7
    Chronic myeloid leukemia                        6
Solid tumors
  Colon carcinoma                                  26
  Breast carcinoma                                 29
  Laryngeal squamous cell carcinoma                34
Total                                             262

Statistical Analysis. Because of the non-normal distribution of the
samples and the small size of some subsets of tumors, the statistical
evaluation was performed using nonparametric tests (SPSS, version 9.0).
Comparison between mRNA expression levels in the different groups of
NHLs was performed using the Kruskal-Wallis Test, with a P for
significance set at 0.05. For differences between particular groups, the
conservative Bonferroni procedure was performed, and the P was set at
0.005. The remaining statistical analyses were carried out using the
Mann-Whitney nonparametric U test (significance, P < 0.05). The
comparison between BMI-1 and p16INK4a quantitative mRNA levels was also
performed using the Pearson's correlation coefficient.
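
The pairwise testing described above can be sketched as follows. This is a minimal illustration and not the SPSS procedure used in the study. It computes only the Mann-Whitney U statistic (no P value) and a Bonferroni-adjusted threshold. The source does not say how the 0.005 threshold was derived; dividing 0.05 by the ten pairwise comparisons possible among five groups is one reading that reproduces it, and is an assumption here. The expression values are hypothetical.

```python
from itertools import combinations


def mann_whitney_u(sample_a, sample_b):
    """Return the Mann-Whitney U statistic (the smaller of U_a and U_b).

    U_a counts the pairs in which a value from sample_a exceeds a value
    from sample_b; tied pairs contribute one half.
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return min(u_a, u_b)


def bonferroni_threshold(alpha, n_groups):
    """Divide alpha by the number of pairwise comparisons among the groups."""
    n_comparisons = len(list(combinations(range(n_groups), 2)))
    return alpha / n_comparisons


# Hypothetical relative-unit values for two tumor groups (not study data).
group_high = [5.1, 4.2, 6.8, 3.9]
group_low = [0.9, 0.4, 1.2, 0.7, 2.0]

u_stat = mann_whitney_u(group_high, group_low)   # 0.0: complete separation
threshold = bonferroni_threshold(0.05, 5)        # 0.05 / 10 comparisons
print(u_stat, threshold)
```

A U of zero means every value in one group exceeds every value in the other. Converting U to a P value requires the exact distribution or a normal approximation, which a statistics package would normally supply.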

Results 

BMI-1 Gene Amplification. The BMI-1 gene was examined by Southern blot
in a large series of human tumors and normal samples (Table 1). The cDNA
probe used in the study detected three EcoRI fragments of 7.3, 3.8, and
2.6 kb and three HindIII fragments of 6.2, 4, and 3.5 kb. BMI-1 gene
amplification (3- to 7-fold) was detected in 4 of 36 (11%) MCLs
(Fig. 1). The amplifications were confirmed with both restriction
enzymes. The amplified MCLs were two blastoid and two typical variants.
No amplifications were observed in any of the solid tumors when compared
with their respective matched non-neoplastic mucosa. No BMI-1 gene
rearrangements were observed in any of the samples examined.

BMI-1 mRNA Expression. To determine the BMI-1 expression pattern in
NHL we analyzed BMI-1 mRNA levels by real-time quantitative RT-PCR in 67
lymphomas (10 CLLs, 27 MCLs, 8 FLs, and 22 LCLs), including the four
tumors with gene amplification. A distinct BMI-1 mRNA expression pattern
was observed in the different types of lymphomas (Fig. 2; Kruskal-Wallis
Test; P < 0.001). The BMI-1 mRNA levels in CLLs (mean, 2.2 RU; SD, 1.3)
and MCLs with no BMI-1 gene amplification (mean, 2.5 RU; SD, 2.3) were
significantly higher than in FLs (mean, 0.9 RU; SD, 0.8) and LCLs (mean,
0.6 RU; SD, 0.4; Mann-Whitney nonparametric U test; P < 0.01). The 4
MCLs with BMI-1 gene amplification showed significantly higher levels of
expression than all other groups of tumors (mean, 5.1 RU; SD, 1.6; P <
0.005). In addition, five typical MCLs with no structural alterations of
the gene also showed very high levels of BMI-1 mRNA expression ranging
from 4 to 9.8 RU, similar to cases with gene amplification (Fig. 2A).

BMI-1 Protein Expression. BMI-1 protein expression was examined by
Western blot in 31 tumors (7 CLLs; 12 MCLs, including two cases with
BMI-1 gene amplification and 4 cases with mRNA overexpression and no
structural alteration of the gene; 8 FLs, and 4 LCLs) in which
additional frozen tissue was available. The monoclonal antibody against
BMI-1 detected three closely migrating proteins of Mr 45,000-48,000 (2).
The two more slowly migrating bands probably represent phosphorylated
isoforms of the protein (12). The two MCLs with gene amplification and
three of four cases with mRNA overexpression without amplification of
the gene showed very high levels of protein expression. The remaining
MCLs and CLLs showed intermediate levels of protein expression, whereas
low- or no-expression signals were detected in the LCLs and FLs included
in the study (Fig. 3). These results indicate that BMI-1 protein
expression in NHL is concordant with the mRNA levels observed by
real-time quantitative RT-PCR.

Relationship between BMI-1 and p16INK4a Gene Alterations. The
INK4a/ARF locus has been recently identified as a downstream target of
the transcriptional repressing activity of the BMI-1 gene, suggesting
that this gene may contribute to human neoplasias with wild type
INK4/ARF (5). Most of the lymphoproliferative disorders analyzed in the
present study, including the four cases with BMI-1 gene amplification,
had been previously examined for p53 gene mutations and INK4a/ARF locus
alterations, including gene deletions, mutations, hypermethylation, and
expression (13, 14). The four MCLs with BMI-1 gene amplification and
mRNA overexpression and the five tumors with BMI-1 mRNA overexpression
with no structural alterations of the gene showed a wild-type
configuration of the INK4a/ARF locus (13). However, one case with BMI-1
gene amplification and one case with mRNA overexpression with no
alteration of the gene showed p53 gene mutations associated with allelic
deletions.

To determine the possible relationship between BMI-1 and p16INK4a mRNA
expression, p16INK4a mRNA levels were evaluated by real-time
quantitative RT-PCR in 50 tumors (10 CLLs, 27 MCLs, and 13 LCLs),
including 6 cases with alterations in the INK4a/ARF locus (2 MCLs and 1
LCL with p16INK4a gene deletion, 2 LCLs with p16 promoter
hypermethylation, and 1 CLL with p16INK4a gene mutation), and the 4
lymphomas with BMI-1 amplification. Negative or negligible levels of
p16INK4a were observed in the 6 tumors with INK4a/ARF locus alterations.
These cases were not included in the comparisons between BMI-1 and
p16INK4a mRNA expression. The p16INK4a expression levels were relatively
similar in the different types of tumors. Only LCLs tended to have lower
levels of expression, but the differences did not reach statistical
significance (Fig. 2B). No differences were observed in the p16INK4a
mRNA levels between tumors with BMI-1 gene amplification and
overexpression and lymphomas with germline configuration of the gene.

Fig. 2. A, quantitative BMI-1 mRNA transcript analysis (median and range) using
real-time RT-PCR in a series of NHLs. MCLs with BMI-1 gene amplification (MCL*)
revealed significantly higher overall BMI-1 mRNA levels than all other types of NHLs,
including MCLs with no structural alterations of the gene (P < 0.005). MCLs and CLLs
expressed significantly higher levels than FLs and LCLs (P < 0.001). Results are depicted
as the ratio of absolute BMI-1:GAPDH mRNA transcript numbers (RU). Bars, SD. B,
quantitative p16INK4a mRNA transcript analysis (median and range) using real-time
RT-PCR in a series of NHLs. Expression levels were relatively similar in the different
types of tumors. Results are depicted as the ratio of absolute p16INK4a:GAPDH mRNA
transcript numbers (RU). Bars, SD.



Discussion 

In the present study, we have examined a large series of human tumors
for the presence of gene alterations and mRNA expression of the BMI-1
gene. Gene amplification was identified in four MCLs. These tumors
showed significantly higher levels of mRNA and protein expression
compared with other lymphomas with BMI-1 in germline configuration.
BMI-1 expression levels were also highly up-regulated in a subset of
MCLs with no apparent structural alterations of the gene. No alterations
were detected in any of the different types of carcinomas included in
the study. BMI-1 is considered an oncogene belonging to the Polycomb
group family of genes. These proteins mainly act as transcriptional
regulators, controlling specific target genes involved in development,
cell differentiation, proliferation, and senescence. Different studies
have shown the implication of BMI-1 overexpression in the development of
lymphomas in murine and feline animal models (3, 4). The findings of the
present study indicate for the first time that BMI-1 gene alterations in
human neoplasms are an uncommon phenomenon, but they seem to occur
mainly in a subset of NHLs, particularly of mantle cell type.

The human BMI-1 gene has been mapped to chromosome 10p13. High-level
DNA amplifications and gains in this region have been identified by
comparative genomic hybridization in occasional solid tumors and NHLs
(10, 11). Different chromosomal translocations involving the 10p13
region have also been identified in infant leukemias and T cell
lymphoproliferative disorders (7, 8, 15). Most acute leukemias with this
chromosomal alteration occur in children <12 months of age, whereas it
seems to be extremely rare in adults. 10p translocations in T-cell
lymphoproliferative disorders have been observed mainly in adult T cell
leukemia/lymphomas and occasional cutaneous T cell lymphomas. In our
study, we did not observe BMI-1 rearrangements or amplifications in any
of the acute leukemias or T cell lymphomas. However, all of the acute
leukemias in this study were diagnosed in patients over 16 years, and no
adult T cell leukemia/lymphomas or cutaneous lymphomas could be included
in the series. Similarly, high-level DNA amplifications at the 10p13
region have been detected in head and neck carcinomas and other solid
tumors. Although we found no evidence for BMI-1 gene rearrangements or
amplifications in a substantial set of carcinomas, this does not exclude
the possibility of increased gene expression or protein levels in these
tumors. Additional studies are required to elucidate the possible
involvement of BMI-1 in these particular groups of human neoplasms.

In human hematopoietic cells, BMI-1 is preferentially expressed in
primitive CD34+ bone marrow cells, whereas it is negative or very low in
more mature CD34- cells (16). In peripheral lymphocytes, and
particularly in follicular B cells, BMI-1 protein expression has been
detected in resting cells of the mantle zone, whereas it is
down-regulated in proliferating germinal center cells (17, 18). These
observations indicate that BMI-1 expression in normal hematopoietic
cells is tightly regulated in relation with cell differentiation in bone
marrow and antigen-specific response in peripheral lymphocytes. BMI-1
expression in human tumors has not been examined previously. In this
study, we have demonstrated that BMI-1 mRNA and protein expression show
a distinct pattern in different types of lymphomas. Thus, BMI-1 levels
were low in LCLs and FLs and significantly higher in MCLs and CLLs.
These findings suggest that BMI-1 expression patterns in B cell
lymphomas maintain in part the expression profile of their normal cell
counterparts, because FLs and at least a subgroup of LCLs are considered
lymphomas derived from follicular germinal center cells, whereas MCLs
and CLLs are tumors mainly derived from naive pregerminal center cells.
However, the four MCLs with BMI-1 gene amplification expressed
significantly higher mRNA levels than all other tumors. In addition,
five MCLs with no structural alterations of the gene showed high mRNA
levels similar to those observed in tumors with BMI-1 gene
amplification, suggesting that other mechanisms may be involved in
up-regulation of the gene in these lymphomas. Different studies using
animal models have shown a dose-dependent effect of BMI-1 gene
expression on skeleton development and lymphomagenesis (1, 3). These
observations suggest that the high mRNA and protein levels detected in a
subset of MCLs may play a role in the pathogenesis of these neoplasms.

Fig. 3. Western blot analysis of BMI-1 protein in NHLs. The amplified MCL (17624)

Recent studies have identified the INK4/ARF locus as a downstream
target of the BMI-1 transcriptional repressor activity, suggesting that
BMI-1 overexpression may contribute to human neoplasias that retain the
wild-type INK4a/ARF locus (5). Interestingly, in our study, BMI-1
amplification and overexpression appeared in tumors with no alterations
in p16INK4a and p14ARF genes. However, we could not detect differences
in the expression levels of p16INK4a in tumors with and without BMI-1
gene alterations. The reasons for this apparent discrepancy with
experimental observations are not clear. One possibility may be that
genes other than INK4a/ARF are the main targets of BMI-1 repressor
activity in these tumors. Particularly, different genes of the HOX
family are regulated by BMI-1 and may also be involved in
lymphomagenesis (19, 20).

In conclusion, the findings of this study indicate that BMI-1 gene
expression is differentially regulated in B cell lymphomas. Alterations
of the gene seem to be an uncommon phenomenon in human neoplasms, but
they may contribute to the pathogenesis in a subset of MCLs. Although
BMI-1 gene alterations occurred in tumors with wild-type INK4a/ARF
locus, the possible cooperation between these genes and the oncogenic
mechanisms of BMI-1 in human neoplasms require additional analysis.

Gene Expression Analysis
Using Microarrays

Sophie E. Wildsmith and Fiona J. Spence 



13.1 Introduction 

Microarrays are of increasing interest to both industry and academia as tools 
for 'gene hunting' and also as quantitative methods for routine analysis of large 
numbers of genes. Techniques such as real-time polymerase chain reaction (RT- 
PCR) TaqMan™ and SybrMan™ are generally considered to be more accurate, 
robust, larger in dynamic range and less capital intensive, but for rapid, large- 
scale gene expression analysis using limited mRNA, microarrays and gene chips 
are preferred. 

13.2 Microarray experiments 
Platforms 

Global gene expression platforms are now available in multiple formats,
including cDNA arrays, oligonucleotides spotted onto slides or in situ
synthesised oligonucleotide arrays manufactured using photolithography.
Commercial sources for these include Stratagene (La Jolla, CA), Memorec
(Köln, Germany) and BD Biosciences (Oxford, UK) for cDNA microarrays,
Mergen Ltd (San Leandro, CA) for spotted oligomers and Affymetrix (Palo
Alto, CA) for oligoarrays synthesised in situ. Purchasing from a supplier is more
expensive than generating microarrays in-house, although the latter is beneficial
in labour-intensive institutions or when proprietary gene information is utilised.
'Off-the-shelf' microarrays may also have the advantage of rigorous quality
control and standardised protocols.

It is possible to produce oligonucleotide spotted arrays in-house, by
designing oligonucleotide sequences that match genes of interest and then
purchasing purified oligonucleotides to spot down on glass or other
substrates. Alternatively there are new systems such as that available
from CombiMatrix (Mukilteo, WA, USA) for computer-aided design and in
situ synthesis of oligonucleotides. However, production of cDNA
microarrays is currently the most affordable and popular method and is
now well established. Numerous sources of information on cDNA microarray
fabrication are available in the literature and on the internet
(Bowtell, 1999; Cheung et al., 1999; Wildsmith and Elcock, 2001 and
http://cmgm.stanford.edu/pbrown). Thus, this chapter will focus on the
implementation of experiments and analysis of data from cDNA microarrays.
The experimental procedure differs slightly according to the number of
fluorophores (or channels) and the type and manufacturer of the array.
We have attempted to describe a generic process, indicating where
possible the different options. Figure 13.1 demonstrates the procedure
for a two-colour hybridisation.







Figure 13.1 The microarray experimental process for two-colour hybridisations 



RNA Extraction 
First, RNA is extracted from the tissue or cells of interest. The quality of the 
RNA extracted is paramount to the overall success of the microarray experiment, 
as impurities in the sample can affect both the probe labelling efficiency 
and the stability of the fluorescent label (Hegde et al., 2000). Snap-freezing 
of tissue in liquid nitrogen, immediately after harvesting, is used to preserve 
RNA integrity. Any further sectioning of the tissues should be carried out under 
RNase-free conditions (Fernandez et al., 1997). Total RNA can be extracted 
using kits such as TRIzol® (Invitrogen, Paisley, Scotland) and RNeasy (Qiagen 
GmbH, Hilden, Germany). Some researchers perform a further extraction 
of mRNA; this results in a purer starting material but has the disadvantage of 
lower yields. Affymetrix recommend between 5 and 40 µg of total RNA for 
their GeneChips™, and 10 µg or less is the required amount of 
starting material for cDNA microarrays (Hegde et al., 2000). 



Sample Labelling 

The mRNA is transcribed in vitro, with the concomitant inclusion of labelled 
nucleotides. The labels may be fluorescent or radioactive. In the case of dual 
channel/colour hybridisations, two samples will be labelled with dyes that 
fluoresce at different wavelengths, with different emission spectra. Example 
fluorophores, available coupled to nucleotides, are Cy3, Cy5, fluorescein 
and lissamine. Wildsmith et al. (2001) have demonstrated that AlexaFluor 
546dUTP™ (Molecular Probes, Leiden, The Netherlands) gives a significantly 
higher signal than Cy3dCTP (Amersham Biosciences, Piscataway, NJ, USA). 
When performing two-colour hybridisations the control sample and 'test' sample 
are labelled with different fluorophores and the subsequent cDNA is then mixed 
together and hybridised simultaneously (Nuwaysir et al., 1999). An advantage of 
simultaneously hybridising control and treated sample is that it obviates the need 
to control for differences in hybridisation conditions or between microarrays. A 
specific example of the huge impact of this technique is its use 
in the first published account of gene expression data for the entire genome of 
Saccharomyces cerevisiae (DeRisi et al., 1997). In two-colour hybridisations, 
one would assume that the properties of the two fluorescent dyes being used 
are equivalent. In fact, for Cy5 and Cy3 this is not the case, as Cy5 has been 
reported to give higher background fluorescence and is also more sensitive to 
photobleaching than Cy3 (Van Hal et al., 2000). In addition, there is evidence 
from several independent sources that the combination of Cy3 and Cy5 dye 
labelling can affect data in certain genes. That is to say, when experiments 
are repeated and the dye combination for the two probes reversed, inconsistent 
results are obtained with certain genes (Taniguchi et al., 2001). Despite these 
facts, two-colour hybridisations are widely accepted throughout the microarray 
community and an example of an image is shown in Figure 13.2. 

Treated: Cy3 (532 nm); Control: Cy5 (635 nm) 

Figure 13.2 Two-colour fluorescent scan of human gene cDNA array. The probe mix 
consists of DNA from HepG2 control cells and cells treated with buthionine sulfoximine for 6 h. 
A colour version of this figure appears in the colour plate section 

Hybridisation and Processing 

After labelling, the cDNA is purified (to remove unincorporated nucleotides), 
mixed with a hybridisation buffer and then applied to a cDNA microarray slide. 
The sample and the slide are heated prior to hybridisation in order to separate 
double-stranded DNA. A coverslip is applied (Shalon et al., 1996), or preferably 
a hybridisation chamber is used to avoid evaporation and enable an even 
hybridisation. The hybridisation and subsequent wash steps are carried out at a 
buffer stringency and temperature that enables hybridisation of complementary 
strands of DNA but reduces non-specific binding. 

Image Capture and Image Analysis 

After hybridisation the microarray slides are scanned, using either a laser or a 
phosphorimager (depending on the type of label used). There are many different 
suppliers and models of fluorescence scanner, for example the ScanArray 5000 
(Perkin Elmer Life Sciences, Zaventem, Belgium), GenePix 4000B (Axon GRI, 
Essex, UK) and the GeneArray® (Affymetrix, Santa Clara, CA). The choice of 
scanner is determined by sensitivity, resolution, flexible wavelength, file size 
generated, throughput and technical support available. 

Images are analysed using software that measures the intensity of the signal 
from the hybridised spotted genes (spots), which provides a measurement 
of the amount of cDNA bound. Thus the initial concentration of messenger 
RNA is inferred. Early software packages 'drew' grids around the spots and 
integrated across the whole area of the grid. This avoided the need to locate 
the spots accurately, which is problematic, especially if the 
spots on the printed arrays are poorly aligned. More recent versions of software 
'draw' circles around the spots themselves and perform measurements within 
and outside of this boundary. For example, the background may be calculated 
from a region outside the spot boundary. The intensity of the signal from the 
spot may be calculated using median, mode or mean values of the pixels within 
the spot. Researchers differ in their preferences regarding using median or mean 
values (Hegde et al., 2000) and this is likely to depend upon the protocols and 
software used. 
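
The difference between the mean and the median of the pixels within a spot can be illustrated with a minimal sketch. The pixel values below are invented for illustration, and the function name summarise_spot is hypothetical rather than part of any image analysis package.

```python
from statistics import mean, median

def summarise_spot(pixels):
    """Return the mean and median intensity of the pixels inside one spot."""
    return {"mean": mean(pixels), "median": median(pixels)}

# Invented pixel intensities: four typical pixels and one bright speck (4000),
# such as might be caused by dust lying inside the spot boundary.
pixels = [510, 495, 505, 500, 4000]

summary = summarise_spot(pixels)
print(summary["mean"])    # 1202: pulled upwards by the single bright pixel
print(summary["median"])  # 505: close to the typical pixel value
```

In this invented case the median is less sensitive to a single aberrant pixel, which is one possible reason for preferring it; the choice in practice depends on the protocols and software used.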

Image analysis software is commonly supplied with scanners or can be 
bought from the same supplier. This has the advantage of being optimised 
for that specific type of microarray and the benefit of upgrades 
and technical support. Software for microarray analysis is available 
from BioDiscovery (http://www.biodiscovery.com), Imaging Research 
(http://imagingresearch.com), GenePix Pro (Amersham Biosciences, 
Piscataway, NJ), arraySCOUT 2.0 (http://www.lionbioscience.com), NIH 
(http://www.nhgri.nih.gov/DIR/LCG/15K/HTML/img_analysis.html), 
Stanford University (http://rana.Stanford.EDU/software), Media 
Cybernetics (Silver Spring, MD, USA) and TIGR (http://www.tigr.org/softlab). 
Important criteria for image analysis software include speed, ease of use, 
automation and the ability to distinguish artefact from real signal (Wildsmith 
and Elcock, 2001). 

As the technology has evolved and more experience has been gained, it has become 
increasingly apparent that the most significant issues facing microarray users 
are the processing of the vast quantities of data generated and deciding exactly 
which tools are the most appropriate for data analysis. Because of the scale 
of this challenge, we have dedicated a complete section to describing the current 
status of this area. 



13.3 Data analysis 

It is important to be cognisant of the fact that the practical laboratory aspects of 
using microarrays are only part of gene expression analysis. Many researchers 
generate vast volumes of data, without a clear understanding of how to manage 
and interpret them. Furthermore, the variability in microarray data confers addi- 
tional problems for analysis. In some cases the purpose of the experiment will 
be a gene-hunting exercise, in which case a cursory indication of potential gene 
biomarkers is sufficient analysis. In other instances, such as pathway mapping 
and screening studies, it is paramount that results are statistically meaningful 
and valid. The next few sections detail some relatively simple analysis methods 
and recommendations for the benefit of researchers with minimal statistical 
training. There are also suggestions for more advanced analysis for those who 
have the assistance of a statistician or specialist data analyst. 

Figure 13.3 The ideal microarray experimental design and process 

Most of the steps before, during and after performing a microarray experi- 
ment are optimally conducted with regard for statistics and data analysis. Careful 
planning before implementation facilitates the downstream analysis and inter- 
pretation of data. The following model summarises the entire microarray process 
with integration of the biological and data analysis components (Figure 13.3). 

Hypothesis Generation 

Any study is conceived for the purpose of investigating or obtaining supporting 
evidence for a biological hypothesis. Giving time at this early stage to consider 
downstream implications will pay dividends later. It is helpful if, rather than 
simply stating the aims of the experiment, the researcher asks the question 'What 
results do I expect?' or 'what answer will validate/invalidate my hypothesis?'. 
This 'reverse-engineering' proves useful in focusing the project, assessing the 
feasibility of the work, providing early preparation for data management and 
analysis and, importantly, in managing expectations with regard to outcomes. 

A good example of careful experimental planning is demonstrated by Golub 
et al. (1999) in the classification of acute leukaemias in order to distinguish 
between acute lymphoblastic leukaemia (ALL) and acute myeloid leukaemia 
(AML). Distinguishing between ALL and AML using conventional techniques 
is known to be a difficult task. The researchers maximised their probability of 
success by choosing an easier, more defined model (normal kidney vs. renal 
cell carcinoma), on which to validate their analytical methods. In doing so they 
established that their techniques were suitable for classifying tissues according 
to disease and gained confidence in their approach before using the samples of 
real interest. 

Optimisation Experiments 

Although microarrays are becoming increasingly accessible to all, using these 
tools requires experience and it is unlikely that successful experiments will be 
conducted immediately. It is usual that some time is given to optimising a sys- 
tem for any specific application, for example for a given tissue or cell type. 
Additionally, the requirements for a given system may warrant some modifications. 
The standard approach for a scientist to take is to vary one parameter, 
whilst keeping all others constant. This is time-consuming and does not take into 
account the interactions between different factors. Well-designed, multifactorial 
experiments (Box et al., 1978) provide a faster route for optimisation, with a 
statistical measure of confidence. An example of this technique is in the 
optimisation of microarray experimental conditions for preparation of fluorescent 
probes from rat liver tissue (Wildsmith et al., 2001). When a major source of 
variation is revealed this can be investigated further with a view to minimising 
it or providing sufficient replicates to account for it. 
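
As a minimal sketch of the multifactorial idea, a two-level full factorial design can be enumerated and its main effects and interactions estimated as follows. The factor names and the response values are hypothetical and are not taken from Wildsmith et al. (2001); the responses are generated from a made-up model so that the expected effects are known.

```python
from itertools import product

# Hypothetical factors, each run at a low (-1) and a high (+1) level.
factors = ["rna_amount", "label_concentration", "incubation_time"]
design = list(product([-1, +1], repeat=len(factors)))  # 2**3 = 8 runs

def main_effect(design, responses, column):
    """Mean response at the high level minus mean response at the low level."""
    high = [y for run, y in zip(design, responses) if run[column] == +1]
    low = [y for run, y in zip(design, responses) if run[column] == -1]
    return sum(high) / len(high) - sum(low) / len(low)

def interaction(design, responses, col_a, col_b):
    """Two-factor interaction: contrast of runs where the levels agree or differ."""
    agree = [y for run, y in zip(design, responses) if run[col_a] * run[col_b] == +1]
    differ = [y for run, y in zip(design, responses) if run[col_a] * run[col_b] == -1]
    return sum(agree) / len(agree) - sum(differ) / len(differ)

# Made-up signal model: factor 0 matters, and it interacts with factor 1.
responses = [100 + 10 * a + 5 * a * b for a, b, c in design]

for index, name in enumerate(factors):
    print(name, main_effect(design, responses, index))
print("rna_amount x label_concentration", interaction(design, responses, 0, 1))
```

Varying one factor at a time from a fixed baseline would estimate the effect of the first factor at only one level of the second, so the interaction built into this invented model would go undetected.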

Design of Experiments 

Once confidence in the experimental procedure has been obtained the researcher 
is likely to have gained an insight into the reproducibility of the system. This 
assists in the design of the experiments, in particular in determining the mini- 
mal number of replicates necessary. Replication can be implemented at many 
stages - from biological samples through to microarray slides. 

Owing to the enzyme-catalysed transcription reactions, a large amount of 
variation occurs during the probe-making stages in microarray experiments. Our 
work indicated that replicates should be made at this step and that a minimum of six 
replicate probes should be made for microarray experiments (Wildsmith et al., 2001). 
These can be pooled or hybridised separately onto six microarray slides. 

Lee et al. (2000) have examined the effect of the different location of cDNA 
spots on the glass slides and concluded that replicates are essential to provide 
meaningful data and to enable reliable inferences to be drawn. 

With regard to commercially available gene chip systems, such as that avail- 
able from Affymetrix (see Figure 13.4), the variation between chips, within a 
batch, is likely to be low due to stringent quality control and highly automated 
manufacture. The use of an automated wash station also reduces variability in 
intensities between chips. However, using a multi-step approach in the probe 
preparation and subsequent antibody binding steps may lead to variation between 
replicate samples prepared on different days. Pooling of reagents within an 
experiment, and analysing controls together with treated samples, will both 
reduce the variability within a given experiment. 

Figure 13.4 The GeneChip® Instrument System. From left to right, the hybridisation 
station, scanner and workstation. Image courtesy of Affymetrix. A colour version of this 
figure appears in the colour plate section 

Conduct of Experiment 

At this stage some attention may be required for verifying and validating 
processes. For example, it is worth checking that the imaging instruments give 
consistent results across the slide, on repeat use and from day to day. If two imagers 
are used it is important to verify that the results from both machines are com- 
parable. Some laboratories read fluorescence of one channel and then adjust the 
laser intensity of the second channel in order to obtain comparable readings. 
This is a method of normalising for the difference in intensities of fluorophores. 
It is important to be aware that this approach has a number of drawbacks. The 
arbitrary value of the second laser intensity setting will vary from experiment 
to experiment; thus comparisons of this channel cannot be made across experi- 
ments. Also the response of the fluorophore may not be linear across the laser 
intensity settings and this can lead to additional errors. 

Another area for investigation prior to running the study itself is the image 
analysis component. Depending on the software used, the image analysis pack- 
age may process the data to some extent, for example automatic background 
subtraction. Full understanding of the software is required so that it is clear at 
what point the data are 'raw', and the extent of inherent, inseparable manip- 
ulation. Effort may be required to determine the optimum settings for any 
software parameters. 

Figure 13.5 The relative fluorescence units (RFUs) of 1248 genes on seven microarray 
slides that were hybridised with cDNA made from the liver of a rat treated with 
acetaminophen. Note the gene outliers at approximately gene number 480 

As data are generated it is important to be aware of the data integrity - for 
example ensuring that all data are collected, so that there are no missing data that 
can complicate analysis later. The researcher may be intuitively aware of any 
spurious results and should be alert for anything extraordinary that could indicate 
problems, for example hybridisation intensities appearing inconsistent from 
sample to sample. Data analysis at this point can be a rapid indicator of dubious 
results. For example, Figure 13.5 shows a plot of the fluorescent intensities of 
1248 genes that were hybridised with probe derived from acetaminophen-treated 
rat liver tissue. The data appear consistent, with the exception of peaks in 
intensity on one slide at around gene 480. Further investigation of the microarray 
revealed a large artefact that had been missed by the image analysis process 
(Figure 13.6). 
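
A quick check of this kind can also be automated. The sketch below flags any spot whose intensity is far above the median for that gene across replicate slides. The five-fold threshold, the function name flag_spikes and the intensity values are all assumptions made for illustration, not a published criterion.

```python
from statistics import median

def flag_spikes(slides, fold=5.0):
    """Return (slide_index, gene_index) pairs whose intensity exceeds
    `fold` times the median intensity of that gene across all slides."""
    flagged = []
    n_genes = len(slides[0])
    for gene in range(n_genes):
        values = [slide[gene] for slide in slides]
        centre = median(values)
        for slide_index, value in enumerate(values):
            if centre > 0 and value > fold * centre:
                flagged.append((slide_index, gene))
    return flagged

# Invented intensities for four genes on three replicate slides.
slides = [
    [120, 300, 80, 950],
    [110, 310, 85, 900],
    [115, 295, 4000, 940],  # an artefact lies over gene index 2 on this slide
]

print(flag_spikes(slides))  # [(2, 2)]
```

Any spot flagged in this way would still need to be inspected on the scanned image before being discarded.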

Raw Data Generation and Storage 

One issue that arises when carrying out microarray analyses is how much data 
to store and in what form. For Good Laboratory Practice (GLP) purposes, often 
required in industry, storage of the raw data is necessary. This could be construed 
as the microarray image. Storing the image analysis results requires far less 
storage space and is easier to visualise, but it has the drawback that image 
analysis cannot be redone should superior software be available in the future. 
In reality, the methods used for microarrays are continually changing and the 
likelihood of revisiting old images on which the analysis has been performed, 
using outdated protocols is quite small. 
Figure 13.6 Portion of scanned image showing region where artefact occurred that caused 
very high signals, which were classed as outliers 



Pre-processing 

A number of pre-processing steps are often used in microarray analysis. These 
include filtering, log transformation, normalisation and background subtraction. 
Filtering may be used before or after transformation in order to extract data 
from preferred regions of interest, or in order to remove outliers (see the 
above example relating to image analysis artefact). One example of filtering 
is the removal of individual gene replicates that lie outside a given number (for 
example 5) of standard deviations from the mean. Alternatively, data points that 
lie in the top/bottom few percentiles (e.g. 0.1%) of the data can be removed. This 
method of removing outliers is also called 'trimming'. It is acceptable if there 
is a large volume of data where only a small proportion of data is removed and 
if the same method is applied consistently across all data. Care must be taken 
in the way in which this is carried out in order not to delete genuine data. For 
example, if one gene is consistently high or low in expression across replicates, 
then it is unlikely to be an outlier. 
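
The two filtering approaches described above can be sketched as follows. The function names and the example values are hypothetical. A threshold of 2 standard deviations is used in the example only because, with seven replicates, a single extreme value cannot lie more than about 2.3 sample standard deviations from the mean, so a larger threshold would never be reached.

```python
from statistics import mean, stdev

def drop_sd_outliers(replicates, n_sd=2.0):
    """Keep replicate values within n_sd sample standard deviations of the mean."""
    centre = mean(replicates)
    spread = stdev(replicates)
    if spread == 0:
        return list(replicates)
    return [v for v in replicates if abs(v - centre) <= n_sd * spread]

def trim_percentiles(values, fraction=0.001):
    """Remove the lowest and highest `fraction` of the data ('trimming')."""
    ordered = sorted(values)
    k = int(len(ordered) * fraction)
    return ordered[k:len(ordered) - k] if k else ordered

# Seven invented replicate readings for one gene, one of them aberrant.
replicates = [100, 102, 98, 101, 99, 100, 500]
print(drop_sd_outliers(replicates))  # the reading of 500 is removed

# Trimming 0.1% from each end of 1000 invented values removes one value per end.
trimmed = trim_percentiles(list(range(1000)), fraction=0.001)
print(len(trimmed), trimmed[0], trimmed[-1])  # 998 1 998
```

Note that trimming removes a fixed proportion of the data whether or not those values are erroneous, which is why it is only acceptable when the proportion removed is small.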

Another method of detecting outliers is to plot all genes (see the section 
'Conduct of experiment' above) or to perform principal component analysis (PCA; 
see the section 'Multivariate analysis' below) to detect replicate outliers. The 
use of a PCA plot to detect outliers is shown in Figure 13.7. 

Figure 13.7 PCA plot of data used in Figure 13.5 showing one microarray (number 2) as 
an outlier 

Log transformation of data is accepted universally because the fluorescence 
data that are generated from microarrays tend to be skewed towards lower values. 
There are scientifically valid reasons why ratios of raw expression values 
should not be used (Nadon and Shoemaker, 2002). When using two-colour 
hybridisations it is common to express the ratio of treated to control as a 
logarithm in base 2 (Quackenbush, 2001). Thus genes up-regulated by a factor of 2 
have a log2(ratio) of 1, and genes down-regulated by a factor of 2 have a 
log2(ratio) of -1. 
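
The symmetry that the base 2 logarithm gives to up- and down-regulation can be checked with a short sketch. The intensity values are invented and the function name log2_ratio is hypothetical.

```python
import math

def log2_ratio(treated, control):
    """Log (base 2) of the treated-to-control intensity ratio."""
    return math.log2(treated / control)

print(log2_ratio(200, 100))  # 1.0: up-regulated by a factor of 2
print(log2_ratio(50, 100))   # -1.0: down-regulated by a factor of 2
print(log2_ratio(100, 100))  # 0.0: unchanged
```

On the raw ratio scale the same two changes would be 2 and 0.5, which are not equidistant from the 'no change' value of 1.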

Normalisation and background subtraction techniques are methods of data 
manipulation and their use is more subjective and often debated. The purpose of 
these techniques is to reduce the error (variability) that occurs between replicates 
and thus enable a comparison of data across samples. 

The theory behind background subtraction is that during hybridisation there 
will be non-specific binding to the slide. This will effectively 'darken' the 
image and give falsely high readings of fluorescent intensity. Correcting for 
the non-specific hybridisation should reduce error due to background staining. 
Background subtraction often occurs automatically in microarray image analy- 
sis packages. The software may circle the spot of interest and use the region 
beyond the periphery as the measurement of background. In cases of uneven 
hybridisation this method enables locally high background to be subtracted on 
a regional basis. One criticism of this approach is that the slide surface beyond 
the periphery is not similar, in chemical terms, to that where the nucleic acid 
has been deposited, and therefore cannot act as a real control for non-specific 
binding. A more accurate measurement of non-specific binding can be gained 
from using a region where spotting chemicals have been deposited, but no target 
is present. This concept is the basis of a method using local 'blank spots' (Wu 
et al., 2001). 
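
The two sources of background estimate discussed above can be compared in a minimal sketch. The intensity values are invented, the use of the median and the clamping of negative results to zero are assumptions made for illustration, and the code is not the procedure of Wu et al. (2001).

```python
from statistics import median

def background_corrected(spot_intensity, background_values):
    """Subtract a median background estimate; negative results are set to zero."""
    corrected = spot_intensity - median(background_values)
    return max(corrected, 0)

spot = 850
local_ring = [90, 110, 100]     # bare slide surface just outside the spot
blank_spots = [140, 160, 150]   # nearby spots printed with buffer but no target

print(background_corrected(spot, local_ring))   # 750
print(background_corrected(spot, blank_spots))  # 700
```

In this invented example the blank spots show more non-specific binding than the bare slide, so the two estimates give different corrected values for the same spot.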

A number of methods exist for normalisation of data. These include normalising 
to total signal or to a 'known' spot or gene, standardisation, or proprietary 
methods. Normalising to total signal is the simplest approach, whereby the gene 
intensity is expressed as a percentage or proportion of the signal intensity for the 
entire array. This method works best when the total intensities for the microar- 
rays are similar and the number of changes is small compared with the number 
of genes. However, we often find, when using arrays of around 1000 genes, that 
pathological disease can up-regulate a large number of genes simultaneously. In 
this case, when total signal normalisation is applied, highly up-regulated genes 
will appear less up-regulated and genes that do not change from the control will 
appear down-regulated. 
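
The distortion described above can be reproduced with a small worked example. The four-gene arrays are invented: three genes are truly up-regulated four-fold and one is unchanged.

```python
def normalise_to_total(intensities):
    """Express each gene as a proportion of the total signal on the array."""
    total = sum(intensities)
    return [value / total for value in intensities]

control = [100, 100, 100, 100]
treated = [400, 400, 400, 100]  # genes 0-2 up four-fold, gene 3 unchanged

ratios = [
    t / c
    for t, c in zip(normalise_to_total(treated), normalise_to_total(control))
]
print([round(r, 2) for r in ratios])  # [1.23, 1.23, 1.23, 0.31]
```

After normalisation the four-fold changes appear as roughly 1.2-fold, and the unchanged gene appears to be down-regulated about three-fold, because the total signal of the treated array has itself increased.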

Normalisation to a control value is a more popular technique. A control value 
can be obtained from using a gene known to remain constant under the conditions 
of the experiment. DeRisi et al. (1997) used a panel of 90 housekeeping 
genes for normalisation, but found considerable variation in their gene 
expression. Unfortunately it is very difficult to know with certainty that a gene will 
not change and there is evidence to suggest that so-called 'housekeeping genes' 
are variable (Savonet et al., 1997). Other control genes can be derived from an 
alternative species; these should not be expected to hybridise. We have used 
yeast and Arabidopsis genes as negative controls for hybridisations of rat 
tissues. No orthologs were known for the genes selected; however, in most cases 
non-specific binding occurred. 

If two or more microarray replicates appear to be different, but they are 
expected to be the same, then they can be standardised. An example of this 
might occur if the total intensity of one microarray is greater than another, but 
the genes are proportionally equally up- or down-regulated. If the microarray 
sample spot data are assumed to be drawn from a normal distribution, then the 
'z-transform' can be used. This requires that the mean and the standard deviation of 
the intensity values for each microarray are determined. The mean is subtracted 
from each individual gene value and the remainder is divided by the standard 
deviation. The intensity values from each microarray will then have the same 
mean and standard deviation. This has the advantage of facilitating comparison 
of microarrays with different dynamic ranges as well as total intensities. If the 
data are not normally distributed, then alternative non-parametric methods can 
be used, such as normalising to the median. 
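
The z-transform and a median-based alternative can be sketched as follows. The two arrays are invented and share the same expression pattern at a ten-fold difference in brightness. The use of the sample standard deviation, and of division by the median, are assumptions, since the text does not specify either detail.

```python
from statistics import mean, median, stdev

def z_transform(values):
    """Subtract the array mean and divide by the array standard deviation."""
    centre = mean(values)
    spread = stdev(values)  # sample standard deviation (an assumption)
    return [(v - centre) / spread for v in values]

def normalise_to_median(values):
    """Non-parametric alternative: divide each value by the array median."""
    centre = median(values)
    return [v / centre for v in values]

array_a = [10, 20, 30, 40, 50]
array_b = [100, 200, 300, 400, 500]  # same pattern, ten-fold brighter

z_a = z_transform(array_a)
z_b = z_transform(array_b)
print([round(z, 3) for z in z_a])
print([round(z, 3) for z in z_b])  # matches the line above
```

After either transformation the two arrays become directly comparable, despite their different total intensities.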

Univariate Analysis 

Univariate methods of analysis involve examining one variable, or gene, at a 
time. This can be a very laborious task when examining a large volume of 
data, yet it is the preferred method of biologists. The simplest technique is to 
compare control and treated values and express the result as a 'fold-change' 
ratio. Typically, when examining small volumes of data, fold changes greater 
than 2 or less than 0.5 are considered meaningful (Quackenbush, 2001). This 
cut-off is essentially arbitrary and has the distinct drawback that microarray data 
are not homoscedastic; that is, there is more variation about the mean at low 
values than there is at high values (Draghici, 2002). 

A second method for finding up- or down-regulated genes uses the standard 
deviation (SD) of the replicate gene data. Changes greater than, say, 2 
SD from the mean log ratio are taken to be considerably greater than changes 
associated with 'noise' and are therefore considered significant. A drawback of 
this technique is that, when looking at a large number of genes whose values are 
normally distributed, some genes will always be reported as up- or down-regulated, 
regardless of whether there are (biological) changes (Draghici, 2002). 
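
Both cut-off rules, and the drawback of the second, can be illustrated with a short simulation. The log ratios are simulated random noise with no real changes, and the function names, noise level and seed are arbitrary choices made for illustration.

```python
import random
from statistics import mean, stdev

def fold_change_call(treated, control, upper=2.0, lower=0.5):
    """Classify a gene using fixed fold-change cut-offs."""
    ratio = treated / control
    if ratio > upper:
        return "up"
    if ratio < lower:
        return "down"
    return "unchanged"

def flag_by_sd(log_ratios, n_sd=2.0):
    """Flag genes whose log ratio lies more than n_sd SD from the mean log ratio."""
    centre = mean(log_ratios)
    spread = stdev(log_ratios)
    return [abs(r - centre) > n_sd * spread for r in log_ratios]

print(fold_change_call(30, 10))      # up
print(fold_change_call(3000, 2500))  # unchanged

# Simulated log ratios for 10000 genes containing noise only.
random.seed(0)
noise_only = [random.gauss(0.0, 0.3) for _ in range(10000)]
fraction_flagged = sum(flag_by_sd(noise_only)) / len(noise_only)
print(round(fraction_flagged, 3))  # a few per cent of genes, despite no true change
```

For normally distributed data roughly 5% of values lie more than 2 SD from the mean, so a rule of this kind reports a similar proportion of genes as changed even when nothing has changed.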

Rather than using arbitrary cut-off values it is far more meaningful to express 
the fold-change in terms of either confidence intervals, or a 'p-value' (that 
is, the probability of a change at least this large occurring by chance). Thus, a 
fold-change of 1.1 may be associated with a p-value of 0.001, meaning that a 
change of this size would be expected by chance only 1 time in 1000. Naturally, 
such small fold-changes may then be queried in terms of biological significance. 
One must then ask the question: Are large fold-changes more important 
(biologically) than small ones? Simple calculation of p-values for two data sets 
can be obtained using t-test functions in standard spreadsheet software. A number 
of replicates are necessary for this approach, and the data must be normally 
distributed. We have recently developed a method for calculating p-values for 
fold-changes that is not influenced by the distribution of the data or outliers 
and applied it to TaqMan™ and microarray data. Other complex and computationally 
intensive methods for calculating p-values are described in Draghici (2002) and 
Nadon and Shoemaker (2002). 
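
One generic way to obtain a p-value without assuming normality is an exact permutation test, sketched below. This is a standard textbook technique used here as a stand-in. It is neither the spreadsheet t-test nor the authors' own method mentioned above, and the replicate values are invented.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(control, treated):
    """Two-sided exact permutation test on the difference in group means."""
    observed = abs(mean(treated) - mean(control))
    pooled = list(control) + list(treated)
    indices = range(len(pooled))
    total = 0
    as_extreme = 0
    # Enumerate every way of relabelling the pooled values as 'treated'.
    for chosen in combinations(indices, len(treated)):
        chosen_set = set(chosen)
        group_t = [pooled[i] for i in chosen_set]
        group_c = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if abs(mean(group_t) - mean(group_c)) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / total

# Invented replicates: a consistent fold-change of only 1.1.
control = [1.00, 1.02, 0.98, 1.01, 0.99, 1.00]
treated = [1.10, 1.12, 1.08, 1.11, 1.09, 1.10]

print(mean(treated) / mean(control))           # fold-change of about 1.1
print(permutation_p_value(control, treated))   # 2/924, about 0.002
```

With six replicates per group there are 924 possible relabellings, so the smallest attainable two-sided p-value is 2/924. This illustrates why a number of replicates are needed, and how a small but highly reproducible fold-change can still give a small p-value.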

Multivariate Analysis 

Multivariate analysis of gene expression data is becoming increasingly popu- 
lar in the microarray community and in other biological domains where large 
volumes of data are generated. Multivariate analysis methods include princi- 
pal component analysis (PCA), factor analysis, multivariate analysis of variance 
(MANOVA) and cluster analysis. Currently, cluster analysis is the most widely 
used method in the microarray community but PCA is growing in popularity 
(Crescenzi and Giuliani, 2001; Konu et al., 2001). 

Quackenbush (2001) provides a good review of clustering tools that is 
unusual in presenting a range of different clustering algorithms and linkage 
methods. Clustering methods are unsupervised, and they are powerful tools 
for gaining insight into huge data sets. They enable the data to be partitioned 
in order to facilitate interpretation; however, they do suffer from subjectiv- 
ity. This is because the user selects various parameters, such as the algorithm 
used, linkage type, distance metric and, sometimes, cluster size. Whatever the 
data, clusters will always be identified, thus there is also a tendency to over- 
interpret the data - trying to attach meaning to clusters that may have no bio- 
logical relevance. 
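
The subjectivity described above can be seen in a minimal sketch of agglomerative clustering. The distance metric (1 minus the Pearson correlation), the average linkage rule and the number of clusters are all user choices, and the gene names and expression profiles are invented. Profiles with no variation would cause a division by zero and are not handled.

```python
from math import sqrt

def pearson_distance(a, b):
    """Distance between two expression profiles: 1 minus Pearson correlation."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / sqrt(var_a * var_b)

def average_linkage(cluster_a, cluster_b, profiles):
    """Mean pairwise distance between the members of two clusters."""
    distances = [
        pearson_distance(profiles[i], profiles[j])
        for i in cluster_a
        for j in cluster_b
    ]
    return sum(distances) / len(distances)

def cluster_genes(profiles, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], profiles)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Invented expression profiles over four time points.
profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [1.1, 2.1, 2.9, 4.2],
    "gene_c": [4.0, 3.0, 2.0, 1.0],
    "gene_d": [3.9, 3.2, 1.8, 1.1],
}

print(cluster_genes(profiles, 2))
print(cluster_genes(profiles, 3))
```

Asking for two or three clusters gives different partitions of the same data, and the procedure returns clusters whichever number is requested.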
Software available for cluster analysis includes Cluster, the output of which 
is viewed in Treeview; both are available from http://rana.stanford.edu/ 
software. This tool is particularly useful for clustering genes to identify genes 
that are co-regulated. 

The PCA is a visualisation tool that enables complex, high-dimensional data 
to be represented in two or three dimensions. It facilitates identification of groups 
of similar data, thus enabling inferences to be made about the samples. 
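
The idea behind PCA can be sketched without specialist software by extracting the first principal component of a small data set using power iteration on the covariance matrix. The data are invented (four arrays by three genes), and power iteration is used only to keep the example self-contained. Packages such as those named later in this section use their own algorithms.

```python
from math import sqrt

def centre_columns(rows):
    """Subtract each column (gene) mean so that every gene has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def covariance_matrix(centred):
    """Sample covariance between every pair of columns (genes)."""
    n = len(centred)
    cols = list(zip(*centred))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]

def first_component(cov, iterations=200):
    """Leading eigenvalue and eigenvector of cov, found by power iteration."""
    vec = [float(i + 1) for i in range(len(cov))]  # arbitrary starting vector
    for _ in range(iterations):
        nxt = [sum(c * v for c, v in zip(row, vec)) for row in cov]
        norm = sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    eigenvalue = sum(
        v * sum(c * w for c, w in zip(row, vec)) for v, row in zip(vec, cov)
    )
    return eigenvalue, vec

# Invented data: rows are arrays, columns are genes. The first two genes
# vary together across the arrays; the third is almost constant.
arrays = [
    [2.0, 1.9, 0.5],
    [4.0, 4.1, 0.4],
    [6.0, 5.9, 0.6],
    [8.0, 8.1, 0.5],
]

centred = centre_columns(arrays)
cov = covariance_matrix(centred)
eigenvalue, loading = first_component(cov)
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
scores = [sum(v * w for v, w in zip(row, loading)) for row in centred]

print(round(explained, 3))             # proportion of variation explained by PC1
print([round(s, 2) for s in scores])   # position of each array along PC1
```

Here almost all of the variation lies along a single component, so each array can be represented by one score with little loss of information. Plotting the scores on the first two components gives the kind of plot discussed in this section.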

An example is shown in Figure 13.8. The figure shows the gene data for one 
sample (control rat liver) that was hybridised to seven microarrays according 
to the method used in Wildsmith et al. (2001). Each microarray contained two 
replicate gene sets; thus there were 14 replicate gene sets in total. The gene sets 
comprised 1248 genes (with controls). All the data (14 x 1248 data points) were 
input into the analysis and the PCA plot displays the 14 replicates individually. 
The two axes are principal components 1 and 2. Principal component 1 (PC1) 
accounts for 65.5% of the variation in the data, whereas PC2 represents only 
13%. This means that the model accounts for 78.5% of variation in the data. 

The first principal component (PC1) accounts for as much as possible of 
the variation in the original data and subsequent components (e.g. PC2) are of 
decreasing importance. Thus, samples 8 and 9 are very different from samples 1 
and 2. In terms of interpreting the PCA plot, it is immediately clear that there are 
three or four distinct clusters of data. These are marked by circles. Datapoints 
tend to cluster in pairs; for example replicates 1 and 2, 3 and 4, 5 and 6, 
etc. These are the duplicate gene sets on the same microarray. This indicates 
that the variation within the microarray is lower than the variation between 
replicate microarrays. However, given that the same sample is applied to all 
the microarrays, we must ask why we get further separation of the replicates. 
The answer lies in the associated data. The four-digit numbers associated with 
the clusters are the microarray slide numbers that indicate when they were 
printed. Numbers that are more similar seem to be more closely related, and thus 
we can hypothesise that there were some differences between the slides, such 
as differences in slide backgrounds or changes during the printing process or 
change-over of batches, between slides in the 2600-2700 region and the 2800s. 

Figure 13.8 PCA plot of the microarray results from seven slides (2690, 2692, 2723, 2869, 
2876, 2879, 2880), each with two replicate spot sets (labelled 1-14), after hybridisation with 
control rat liver. See text for explanation 

The PCA provides a clearer overview of the data than does cluster analysis. 
It is a rapid method for gaining an insight into the results, in particular 
where biological meaning can be attached to the components (Crescenzi et al., 
2001). There are a number of packages for multivariate data analysis, including 
SIMCA-P (Umetrics, AB, Umea, Sweden) and The Unscrambler (Camo ASA, 
Norway), both of which are useful for PCA. 

Other tools for data visualisation include software packages such as 
Spotfire.net (Spotfire Inc., Cambridge, MA, USA) and GeneSpring (Silicon Genetics, 
San Carlos, CA, USA). Spotfire is particularly useful for visualisation of 
multidimensional data and for visualisation of temporal data. It is possible to use 
these tools to identify genes that are co-ordinately expressed over time. 

Biological Interpretation 

After developing a sound experimental strategy, ensuring that the results are 
statistically valid, and after analysis of the data, it is down to the biologist to 
assemble the pieces of information that have been obtained. This intertwined 
information may include unexpected results that are contradictory to intuition or 
to published literature. One way to untangle the data is to map the relevant genes 
onto existing pathways and known functions. The Kyoto Encyclopedia of Genes 
and Genomes (KEGG), available at http://www.genome.ad.jp/kegg/, is a 
useful source of information, especially where the gene products are enzymes. It 
enables visualisation of the position of up- or down-regulated genes in metabolic 
pathways. 

The gene expression data obtained may differ from protein expression data, 
or information on gene product activity or location. When initiating a study 
it is useful to consider additional endpoints that can assist in the interpreta- 
tion of the data. For in vitro studies, these might include cytotoxicity endpoints, 
metabolites, key signalling molecules or perhaps protein expression. Waring 
et al. (2001) used tetrazolium dye reduction (MTT) as a measure of hepatocyte 
cell viability for their studies of gene expression in response to hepatotoxic 
insult. For in vivo studies, expression information on the tissue of interest could 
be supported by pathology, histology and blood chemistry measurements. Gene 
expression results could be confirmed by in situ hybridisations or protein activity 
assays. 
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13.4 Recent examples of microarray applications 

One area of rapid progress using microarray technology is the increased under- 
standing of cancer. Molecular pathologists are subgrouping cancers of tissues 
such as blood, skin and breast, based on differential gene expression patterns. 
For example, within a small group of breast cancer tissue samples, Perou et al.
(2000) distinguished two broad subgroups: those expressing and those lacking
expression of the oestrogen receptor-α gene. The work was not conclusive, but
progress in this field has never been so rapid compared with the previous
methods of gene identification.

Another example of the impact of this technology is in the identification of
two biomarkers for prostate cancer, namely hepsin and PIM1 (Dhanasekaran
et al., 2001).

Microarray technology has also accelerated the understanding of the molecular
events surrounding pulmonary fibrosis. Specifically, two distinct clusters of
genes associated with inflammation and fibrosis have been identified in a
disease where, for years, the pathogenesis and treatment have remained unknown
(Katsuma et al., 2001).

13.5 Conclusions 

Important factors in gene expression experiments include sensitivity, precision
and reproducibility in the measurement of specific mRNA sequences (Schmittgen
et al., 2000). These quality metrics can be maximised by using, or fabricating,
high-quality microarrays, and by optimising each step of the microarray
process. From conception to conclusion it is important to bear in mind the
original hypothesis.

Having considered the complexity of the microarray experiment, the value 
obtained from a meticulously designed experiment should not be underesti- 
mated. As the number of high-quality gene expression studies increases, we hope 
that the literature will contain increasingly detailed information that will help 
interpret complex gene expression changes, and thus elucidate the mechanisms 
of disease.

Amplification and overexpression of putative oncogenes confer growth
advantages for tumor development. We used a functional genomic approach that
integrated simultaneous genomic and transcript microarray, proteomics, and
tissue microarray analyses to directly identify putative oncogenes in lung
adenocarcinoma. We first identified 183 genes with increases in both genomic
copy number and transcript in six lung adenocarcinoma cell lines. Next, we
used two-dimensional polyacrylamide gel electrophoresis and mass spectrometry
to identify 42 proteins that were overexpressed in the cancer cells relative to
normal cells. Comparing the 183 genes with the 42 proteins, we identified four
genes - PRDX1, EEF1A2, CALR, and KCIP-1 - in which elevated protein expression
correlated with both increased DNA copy number and increased transcript levels
(all r>0.84, two-sided P<0.05). These findings were validated by Southern,
Northern, and Western blotting. Specific inhibition of EEF1A2 and KCIP-1
expression with siRNA in the four cell lines tested suppressed proliferation
and induced apoptosis. Parallel fluorescence in situ hybridization and
immunohistochemical analyses of EEF1A2 and KCIP-1 in tissue microarrays from
patients with lung adenocarcinoma showed that gene amplification was associated
with high protein expression for both genes and that protein overexpression was
related to tumor grade, disease stage, Ki-67 expression, and a shorter survival
of patients. The amplification of EEF1A2 and KCIP-1 and the presence of
overexpressed protein in tumor samples strongly suggest that these genes could
be oncogenes and hence potential targets for diagnosis and therapy in lung
adenocarcinoma.
Oncogene (2006) 25, 2628-2635. doi:10.1038/sj.onc.1209289;
published online 12 December 2005

Keywords: lung cancer; microarrays; proteomics; tissue 
microarray 



Introduction 

In lung adenocarcinoma, as in other types of cancer, gene amplification and
the consequent overexpression of the amplified oncogene play an important role
in the development of tumors, because their overexpression confers a growth
advantage. The ability to identify putative oncogenes that are activated during
tumorigenesis could facilitate the choice of molecular genetic targets for
diagnosis and therapy of the disease. This concept has been exemplified by
HER-2, which was first found to be amplified in neuroblastomas and subsequently
shown to be associated with poor prognosis in breast cancer (Ross and Fletcher,
1999). Now, HER-2 aberrations are used as a predictor of response to therapy,
and treatment of HER-2-positive breast cancer with the monoclonal anti-HER-2
antibody trastuzumab has been shown to improve prognosis (Ross and Fletcher,
1999). Emerging evidence of common amplicons in lung adenocarcinomas (Luk
et al., 2001; Jiang et al., 2004; Tonon et al., 2005) suggests that additional
oncogenes remain to be identified; however, conventional techniques are
ineffective in pinpointing such oncogenes. Parallel measurement of DNA copy
number and mRNA levels in cDNA microarrays permits changes in copy number to be
compared with transcription levels on a gene-by-gene basis to generate lists of
candidate genes within the defining amplicons (Hyman et al., 2002; Pollack
et al., 2002). However, use of transcript patterns does not allow assessment of
the expression of protein products or identification of proto-oncogenes.
Another approach, identifying differentially expressed proteins by proteomic
analysis and then comparing the proteins present with mRNA expression in cDNA
microarrays from the same specimens, can clarify the extent to which changes in
transcript patterns reflect changes in their cognate proteins and
post-transcriptional mechanisms (Chen et al., 2002), but this approach cannot
be used to identify oncogenes driven by extensive increases of their gene copy
number. Moreover, using individual microarrays or proteomic approaches alone
cannot distinguish the cancer-driving oncogenes that directly propel tumor
progression from the larger number of passenger genes that may be concurrently
over-represented but are not biologically relevant in tumor development.

In this study, we used a comprehensive approach that integrated simultaneous
comparative genomic hybridization (CGH) and transcript microarray with
proteomic analyses of six lung adenocarcinoma cell lines. We directly and
specifically identified four putative oncogenes that could have been activated
through amplification and consequent elevation of transcript expression. We
used small interfering RNA (siRNA) to inhibit the expression of two of these
four genes in the lung cancer cell lines, which further implicated them in
oncogenesis. We then explored the clinical significance of these findings by
assessing the expression of these two genes in tissue microarrays of human lung
cancer specimens. Our findings underscore the power of integrated functional
genomic analyses for identifying putative oncogenes in tumorigenesis; such
activated genes could be useful as targets for diagnosis or therapy in lung
cancer.



Results 

Simultaneous global genomic and transcript analyses
identify 183 genes with increases in genomic copy
numbers and transcript expression levels

To identify genes in which increased DNA copy number might contribute to
increased transcript in lung adenocarcinomas, first we used CGH with
microarrays of six lung adenocarcinoma cell lines. We identified 587 genes
showing increases in DNA copy number across all six cell lines (Supplementary
Table 1S), which were distributed as 90 amplicons on all chromosomes except
for chromosomes 13 and Y (Supplementary Table 2S). A subsequent transcript
test with the identical arrays of the same cell lines revealed 275 genes that
showed increased mRNA levels (Supplementary Table 3S). Using random
permutation tests across all cancer cell lines, we identified 183 genes (31%)
that showed elevated transcript levels from the 587 genes that were
over-represented in the genome (Table 1), suggesting that elevated transcript
levels of the 183 genes may reflect their genomic over-representation in the
cancer cells. These findings are consistent with previous reports linking
genomic changes with altered transcript patterns in breast cancer (Hyman
et al., 2002; Pollack et al., 2002). However, our finding that only 31% of the
genes showing increased DNA copy numbers had cognate increases in transcript
expression in lung adenocarcinomas is different from the overall rates of
40-60% reported for breast cancer (Hyman et al., 2002; Pollack et al., 2002).
This discordance may reflect methodologic differences between studies or
biological differences between breast cancer and lung adenocarcinoma.

Proteomic analyses identify four genes for which protein
abundance was associated with increases in the cognate
gene and transcript levels

Analysis of transcript patterns is insufficient for understanding the
expression of protein products and the effect of genomic over-representation
on the expression of their cognate proteins. To extend these findings beyond
genomic over-representation to expression of the protein products of those
genes, we next assessed protein expression in the same cell lines by
two-dimensional polyacrylamide gel electrophoresis (PAGE) and found that 42
different proteins, representing 42 individual genes, were significantly
increased in the cancer cell lines (Table 2; Supplementary Figures 1S and 2S).
Some of these proteins were identified as having multiple isoforms, and all
individual isoforms exhibited increases in expression ranging from 4.6 to 12.8
times their expression in normal lung tissue cells. In comparing protein level
of the 42 genes with changes in their cognate genomic and mRNA expression from
the global microarray analyses, we found that four (9.5%) of those 42 genes -
PRDX1, EEF1A2, CALR, and KCIP-1 - showed statistically significant
correlations between elevated protein expression and increases in both copy
number and mRNA expression (all r>0.84; P<0.05) (Table 2) in the cancer cell
lines. These findings imply that the abundance of these four proteins is
attributable to the amplification and consequent elevated transcription of
their cognate genes.

Validation of copy number, transcript, and protein
expression of PRDX1, EEF1A2, CALR, and KCIP-1
in lung cancer cell lines

To confirm our findings from the high-throughput analyses, we next used
Southern, Northern, and Western blotting to assess DNA, RNA, and protein levels
for the four genes identified in the six cell lines. For comparison, we
arbitrarily chose one gene, NFKB1, in which an increase in protein level did
not correlate with genetic changes. Overall, we found excellent concordance
between the CGH microarray and Southern blotting analyses, transcript array
and Northern blotting analyses, and proteomic and Western blotting analyses for
all five genes (Figure 1). For example, KCIP-1 showed fivefold amplification
in five of the six cancer cell lines, whereas NFKB1 showed no such increase in
any of the cell lines. As for transcript expression, Northern blotting of
EEF1A2 showed high expression in five of the six cancer cell lines; again,
levels of NFKB1 transcript were not increased in any cancer cell line as
compared with normal bronchial epithelial cells. The results of Western
blotting were also consistent with the results of the proteomic experiments;
for example, five of the cancer cell lines exhibited strong protein bands for
PRDX1 as compared with normal cells. These findings provide strong support for
the validity of the results derived from the high-throughput techniques in
this study.

These parallel analyses also revealed close correlations in the extent of
changes in gene copies, transcript, and protein of each of the four genes in
the cancer cell lines. For example, in the five cancer cell lines that showed
at least fourfold increases in EEF1A2 copy number, expression of transcript
and protein was also increased by at least a factor of four as well (relative
to their expression in normal cells; Supplementary Figure
3S). The protein abundance of the four genes showing corresponding increases
in both DNA copy number and mRNA provides further evidence that these could be
oncogenes, the activation of which is reflected by genomic amplification and
consequent increases in transcript level in lung adenocarcinoma cell lines.

Table 1 List of 183 genes with statistically significant correlation
(P<0.05) between genomic copy number and transcript level

Gene symbol  Chro.  Distance from p arm of each chromosome (Mb)  P
ENO1         1      8.5        0.0085
DDOST        1      20.1       0.0111
SFN          1      26.4       0.0113
MLP          1      32.2       0.0114
AKR1A1       1      45.4       0.0128
PRDX1        1      45.4       0.0122
UQCRH        1      46.2       0.0125
RPL7         1      96.4       0.0127
COL11A1      1      102.6      0.0129
MCL1         1      147.3      0.0222
PSMB4        1      148.1      0.0131
JTB          1      150.7      0.0134
RPS27        1      150.7      0.0135
HAX1         1      151        0.0266
MUC1         1      151.9      0.0143
CCT3         1      153.1      0.0167
CRABP2       1      153.4      0.0148
TKT          1      159.3      0.0152
ATP1B1       1      165.8      0.0234
c -urn       1      199.7      0.0154
SNRPE        1      200.2      0.0165
YWHAQ        2      9.6        0.0159
ODC1         2      10.6       0.0119
RPL31        2      101.2      0.0161
BENE         2      110.4      0.0169
STAT1        2      191.8      0.0175
HSPD1        2      198.3      0.0277
HSPE1        2      198.3      0.0185
RPL37A       2      217.3      0.0388
IGFBP2       2      217.5      0.0189
RPS7         2      3.3        0.0193
RAB1A        2      65.3       0.0204
IGKC         2      89.0       0.0285
LTF          3      46.3       0.0455
PFN2         3      151        0.0207
KPNA4        3      161.5      0.0211
S100P        4      6.7        0.M22
UGDH         4      39.3       0.0215
UCHL1        4      41.1       0.0222
SPP1         4      89.3       0.0227
TRIM2        4      154.7      0.0231
FGB          4      156        0.0235
FGG          4      IS*        0.0441
SDHA         5      0.251      0.0243
PDCD6        5      0.305      0.0245
CCT5         5      10.3       0.0446
PTPRF        5      14.2       0.0248
RPL37        5      40.8       0.0251
ENC1         5      74         0.0336
QP-C         5      132.2      0.0466
SPINK1       5      147.2      0.0256
CANX         5      179.2      0.0263
SOX4         6      21.7       0.0321
HDGF         6      22.6       0.0362
RPS10        6      34.6       0.0177
RPL10A       6      35.4       0.0369
VEGF         6      43.7       0.0372
OSF-2        6      45.4       0.0173
FSCN1        7      5.3        0.0378
CYCS         7      24.9       0.0381
CBX3         7      25.9       0.0289
IGFBP3       7      45.7       0.0389
CLDN4        7      72.7       0.0403
HSPB1        7      75.5       0.0433
CALR         7      92.7       0.0425
COL1A2       7      93.6       0.0457
ATP5J2       7      98.7       0.0475
AKR1B10      7      133.6      0.0481
RPS20        8      56.7       0.0482
TCEB1        8      74.6       0.0486
LAPTM4B      8      98.5       0.0497
RPL30        8      98.7       0.0054
KCIP-1       8      101.6      0.0093
PABPC1       8      101.78     0.0119
EEF1D        8      144.4      0.0121
TSTA3        8      144.5      0.0122
RPL8         8      145.6      0.0128
TRA1         9      117.1      0.0136
RPL35        9      121.1      0.0133
HSPA5        9      121.5      0.0135
LCN2         9      124.4      0.0137
DPP7         9      133.4      0.0139
PFKP         10     3.2        0.0223
AKR1C1       10     5.1        0.0146
PLAU         10     75.6       0.0356
DSP          10     76.7       0.0289
TALDO1       11     0.434      0.0143
SLC22A1L     11     2.9        0.0151
TSSC3        11     2.9        0.0611
RPL27A       11     8.7        0.0156
ST5          11     8.8        0.0162
LDHA         11     18.5       0.0168
MDK          11     46.4       0.0162
DOC-1R       11     67.5       0.0167
MMP12        11     102.8      0.0177
HYOU1        11     118.9      0.0183
SCNN1A       12     6.3        0.0185
LDHB         12     21.7       0.0193
KRT7         12     52.3       0.0196
KRT5         12     52.6       0.0197
KRT6E        12     52.6       0.0201
ERBB3        12     56.2       0.0212
NACA         12     56.8       0.0218
TM4SF3       12     71.2       0.0401
NTS          12     86.2       0.0215
ASCL1        12     103.3      0.0219
TXNRD1       12     104.6      0.0223
CKAP4        12     106.6      0.0124
COX6A1       12     120.7      0.0435
BGN          12     122.5      0.0235
RAN          12     129.88     0.0238
RPL36A       14     48H        0.0243
PGD          14     50.7       0.0248
THBS2        15     37.5       0.0251
TRAF4        15     38.3       0.0253
SPINT1       15     38.7       0.0254
RPL17        15     45.26      0.0411
PKM2         15     70.1       0.0258
IDH2         15     88.2       0.0211
RPL23A       16     0.377      0.0264
MSLN         16     0.753      0.0366
UBE2I        16     1.3        0.0271
RPS2         16     1.95       0.0281
CLDN9        16     3.1        0.0329
ARL6IP       16     18.7       0.0412
EIF3S8       16     28.3       0.0336
TUFM         16     28.9       0.0377
ALDOA        16     30.1       0.038
NME4         16     53.6       0.0381
GPR56        16     57.4       0.0386
CDH1         16     68.5       0.0289
NQO1         16     69.5       0.0396
SLC7A5       16     87.6       0.0397
APRT         16     88.6       0.0411
GALNS        16     88.6       0.0255
RPL13        16     89.3       0.0431
MCP          17     32.4       0.0465
ERBB2        17     35.11      0.0483
JUP          17     39.8       0.0495
ORF          17     40.39      0.0505
RPL27        17     41.1       0.0046
NME1         17     46.59      0.0082
COL1A1       17     48.6       0.0108
ABCC3        17     49.1       0.0326
NME2         17     49.6       0.0111
RPL38        17     72.7       0.0117
SMT3H2       17     73.6       0.0119
SYNGR2       17     76.6       0.0122
LGALS3BP     17     77.4       0.0127
P4HB         17     80.3       0.0126
PPAP2C       19     0.221      0.0228
GPI          19     39.55      0.0145
HPN          19     40.2       0.0129
ZNF146       19     41.4       0.0131
SPINT2       19     43.4       0.0238
PSMD8        19     43.5       0.0132
YIF1P        19     43.5       0.0135
RPS16        19     44.6       0.0144
CEACAM5      19     46.9       0.0145
CEACAM6      19     46.9       0.0143
GIPR         19     50.8       0.0259
SNRPD2       19     50.9       0.0413
KDELR1       19     53.6       0.0152
RPL28        19     60.6       0.0156
RPS5         19     63.6       0.0267
TRIM28       19     63.7       0.0158
DAP          20     35.6       0.0166
TOP1         20     40.3       0.0172
UBE2C        20     45.1       0.0174
RPS21        20     61.6       0.0268
EEF1A2       20     62.8       0.0185
TFF3         21     42.6       0.0186
TFF1         21     42.7       0.0192
CSTB         21     44.1       0.0201
MIF          22     22.6       0.0202
XBP1         22     27.5       0.0204
PRDX4        X      22.9       0.0198
SYN1         X      46.3       0.0204
TIMP1        X      46.3       0.0209
PLP2         X      47.8       0.0212
MAGED1       X                 0.0331
RPS4X        X      71         0.0124
SSR4         X      152.6      0.0232
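
Table 1 rests on testing, gene by gene, whether copy number and transcript level rise together across the six cell lines. The exact statistic behind the random permutation tests is not given in this section, so the following Python sketch is only an assumption-laden illustration: it uses a Pearson correlation and enumerates all 720 orderings of six values to obtain an exact one-sided permutation p-value. The function names and the log2 ratios are hypothetical.

```python
from itertools import permutations
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def permutation_p_value(copy_number, transcript):
    """Exact one-sided permutation p-value for a positive correlation."""
    observed = pearson(copy_number, transcript)
    hits = total = 0
    # Re-pair the transcript values with the cell lines in every possible order.
    for shuffled in permutations(transcript):
        total += 1
        if pearson(copy_number, shuffled) >= observed - 1e-12:
            hits += 1
    return observed, hits / total

# Hypothetical log2 ratios for one gene in six cell lines (not study data).
copy_number = [2.1, 1.8, 2.4, 0.2, 2.0, 1.5]
transcript = [2.3, 1.6, 2.6, 0.1, 1.9, 1.2]
r, p = permutation_p_value(copy_number, transcript)
```

With only six cell lines the smallest attainable p-value under this scheme is 1/720, which is one reason an exhaustive enumeration is practical here.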

Specific inhibition of EEF1A2 and KCIP-1 expression by
siRNAs led to decreased cell proliferation and induction of
apoptosis

To further prove the oncogenic function of the identified genes in lung
tumorigenesis, we used siRNAs to inhibit the endogenous expression of EEF1A2
and KCIP-1 protein in four lung cancer cell lines (H1563, H229, H522, and
SK-LU). Transfection of the cancer cells with specific siRNAs reduced the level
of EEF1A2 and KCIP-1 protein by 70-90% 48 h after transfection (Supplementary
Figure 4S). In contrast, EEF1A2 and KCIP-1 protein levels remained unchanged
in mock-treated control cells and in cells transfected with a scrambled siRNA
sequence. At 48 h after siRNA transfection, the percentage of proliferation of
the transfected cancer cells was reduced to 15-30% as compared with 91-100% of
cell proliferation of the same cell lines treated with PBS or scrambled siRNA
(Supplementary Figure 5S). Apoptosis of siRNA-transfected cells was 27-34%,
whereas only 4% of the same cell lines treated with PBS or scrambled siRNA
showed apoptosis. These results strongly support an oncogenic role for the
identified genes in lung cancer and confirm their potential usefulness as
therapeutic targets for the disease.



Amplification and protein expression of KCIP-1 and
EEF1A2 in lung tissue

To further validate these findings and to assess the possible clinical
significance of the four putative oncogenes identified from the cell lines, we
first applied fluorescence in situ hybridization and immunohistochemical
analysis, in parallel, to commercially available human lung tissue microarrays
(Ambion, Austin, TX, USA) to evaluate the status of two of these four genes in
lung cancer tissue specimens. (Commercially available antibodies to PRDX1 or
CALR were not suitable for use in immunohistochemical analysis when this report
was written.) Overexpression of KCIP-1 and EEF1A2 protein in the tumors was
concordant with amplification of the corresponding genes (P = 0.0003 for
KCIP-1 and P = 0.0011 for EEF1A2). For example, 16 (35%) of the 46 lung
adenocarcinomas in the microarray showed amplification of KCIP-1, and strong
cytoplasmic staining for KCIP-1 protein was seen in 18 tumors (39%)
(Figure 2). We next examined whether overexpression of these genes was
associated with increased cell proliferation by analysing Ki-67 expression in
contiguous sections of the tissue microarrays. Positive Ki-67 expression was
found to correlate with positive expression of both KCIP-1 (P = 0.02) and
EEF1A2 (P = 0.01). To extend these findings, we then studied 11 tissue
microarray blocks comprising normal and tumor tissue specimens from 113
patients with pathologic stage I non-small-cell lung cancer who had undergone
curative surgery (Wang et al., 2005). Immunohistochemical analysis showed that
EEF1A2 was expressed in 32 cases (28%) and KCIP-1 in 29 cases (26%).
Univariate and multivariate Cox proportional hazards models were used to detect
possible associations between EEF1A2 and KCIP-1 expression and
clinicopathologic variables. Expression of EEF1A2 or KCIP-1 was associated
with short overall survival time (P = 0.0012 for EEF1A2 and P = 0.0026 for
KCIP-1) (Supplementary Figure 6S). Age at diagnosis, histologic type of
cancer, degree of tumor differentiation, and smoking history were not
associated with survival time.

Identifying oncogenes in lung adenocarcinoma
R Li et al

Table 2 Proteins showing significant overexpression in cancer cell lines
relative to those in normal bronchial epithelial cell lines and their
correlation coefficients with increased DNA copy number or mRNA values

Only genes showing statistically significant increased protein expression
with simultaneous increases in both genomic copy number and transcript are
considered putative oncogenes in lung adenocarcinoma cells. r, Spearman
correlation coefficients between proteins and genomic or mRNA values are based
on all six cancer cell lines; bold indicates P<0.05, if r > 0.840. Mw,
molecular weight; pI, isoelectric point.

Although only two genes were validated in the lung tissue microarrays
(because available antibodies to the other two genes were not suitable for use
in immunohistochemical analysis), these findings are consistent with those from
our cell lines, demonstrating again that genomic amplification and consequent
increases in amounts of transcript may be, at least in part, driving the
abundance of proteins in these lung tumors. The association between expression
of these genes and that of Ki-67, a known indicator of poor prognosis in lung
cancer (Martin et al., 2004), suggests that activation of these genes may be an
indicator of tumor aggressiveness. These results also suggest that expression
of EEF1A2 and KCIP-1 proteins in stage I non-small-cell lung cancer may be
useful as a marker for distinguishing patients with relatively poor prognosis
from those who might benefit from adjuvant treatment.

Discussion

Our current study illustrates the power of integrated functional genomic
analyses for identifying putative oncogenes and for evaluating their potential
clinical significance. Among the four identified oncogenes, three genes (PRDX1,
CALR, and KCIP-1) have been implicated in lung tumorigenesis. PRDX1 is an
antioxidant protein involved in regulating cell proliferation, differentiation,
and apoptosis. Kim et al. (2003) found PRDX1 expression to be elevated in both
lung cancer and adjacent normal lung tissue, suggesting that activation of
PRDX1 may enhance proliferation in lung cancer. CALR has a major role in Ca2+
binding and the transcriptional regulation of other genes and was recently
found to be overexpressed in 73% of 40 lung adenocarcinomas (Oates and Edwards,
2000). KCIP-1 belongs to the 14-3-3 family, which participates via the MAPK and
Wnt signaling pathways in the regulation of many cellular processes including
cell proliferation and differentiation as well as tumorigenesis (Thomas et al.,
2005). KCIP-1 was recently found to be expressed in all 12 lung tumors tested
in a single-institution study (Qi et al., 2005). Interestingly, EEF1A2 was
originally considered a putative oncogene in ovarian cancer on the basis of its
being amplified in 25% and overexpressed in 30% of the same set of ovarian
tumors (Anand et al., 2002); functional analyses have established its oncogenic
role in cellular transformation (Lee, 2003). Our discovery that EEF1A2 may be a
putative oncogene in lung adenocarcinoma demonstrates the power of our
functional genomic strategy for rapidly identifying potential oncogenes.

Figure 1 Confirmation by Southern, Northern, and Western blot analyses of
increased DNA copies, transcript levels, and protein levels in the four genes
identified in high-throughput analyses. For comparison, we arbitrarily chose
one gene, NFKB1, in which an increased protein level did not correlate with
genetic changes. The blotting results are consistent with the results from the
CGH array, transcript array, and proteomic analyses. Nor, indicates normal
bronchial epithelial cell line. All the experiments were repeated at least
three times with each cell line. Means of signal intensities normalized to
β-actin on Southern, Northern, and Western blots, along with 95% confidence
intervals, were calculated (β-actin signals are not shown in the figure; two
different normal bronchial epithelial cell lines were used in the confirmation
and only one normal cell line is shown in the figure).

Although the main focus of this study was to specifically identify putative
oncogenes, it should be noted that 90.7% of the genes showing high protein
expression did not show corresponding increases in both DNA copy number and
transcript, a finding consistent with that of others that transcriptional,
translational, and post-translational regulatory mechanisms can greatly
influence the abundance of protein in lung tumorigenesis (Chen et al., 2002).
For example, NFKB1 is a critical arbiter of immune responses, cell survival,
and transformation and is often activated in several types of tumors (Chen
et al., 2002); deregulation of NFKB1 is thought to be modulated through
phosphorylation of Ser337 by protein kinase A (Chen et al., 2002). In our
study, 68.8% of the genes showing over-representation in the genome did not
show elevated transcript levels, implying that at least some of these genes are
'passenger' genes that are concurrently amplified because of their location
with respect to amplicons but lack biological relevance in terms of the
development of lung adenocarcinoma.
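
The concordance fractions quoted from the Results can be checked with simple arithmetic. The short Python sketch below recomputes three of them from the gene and protein counts stated in the text; the variable and function names are illustrative only.

```python
# Counts taken from the Results section.
genes_amplified = 587            # genes with increased DNA copy number
genes_amp_and_transcript = 183   # of those, genes with increased transcript too
proteins_overexpressed = 42      # proteins increased in the cancer cell lines
proteins_concordant = 4          # PRDX1, EEF1A2, CALR and KCIP-1

def percent(part, whole):
    """Percentage of `whole` made up by `part`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

transcript_concordant = percent(genes_amp_and_transcript, genes_amplified)
amplified_only = percent(genes_amplified - genes_amp_and_transcript,
                         genes_amplified)
protein_concordant = percent(proteins_concordant, proteins_overexpressed)
```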

Although the potential oncogenes we identified here are likely to be
important, certainly other oncogenes could be involved in the development of
lung adenocarcinoma. The oligo microarray we used consists of 22 000 probes,
which represent only about 60% of the human genome. Moreover, each probe was
designed for the 3' region of expressed sequence tags of the selected genes.
Also, our results were initially derived from cancer cell lines, although the
findings were later confirmed in human tissue samples. Our ongoing study using
microarrays with information on more genes and the development of
high-resolution proteomic analyses for use with larger numbers of specimens
will allow more comprehensive analyses of the molecular consequences of gene
amplifications. Such expanded analyses will very likely lead to the
identification of additional oncogenes.

Some of the results of our current study were comparable to those of other
studies of lung cancer. For example, genomic copy number and protein levels of
KCIP-1 were previously found to be amplified and overexpressed in primary lung
cancers by cDNA clone-based CGH array analysis (Jiang et al., 2004) and
proteomic analysis (Chen et al., 2002), respectively. Our functional genomic
approach, which integrates simultaneous CGH, transcript microarrays, proteomic
analyses, and siRNA, allows us not only to quickly identify potential oncogenes
but also to explore their significance as diagnostic and therapeutic targets in
tumor progression, more than could be achieved by any technique alone.

Genes identified in this way may serve as promising 
targets for diagnosis and therapy in lung adenocarci- 
noma. Further research on the clinical implications of 
such genes is needed; experiments now underway in our 
laboratory include overexpression of the genes in 
normal cells, disruption of the function of these genes 
in cancer cells, and investigation of how interactions 
among these genes (or interactions with other known 
oncogenes) may mediate the expression of the trans- 
formed phenotype. 
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Materials and methods 

Cell lines 

Six human lung adenocarcinoma cell lines (H23, H229, HI 792, 
SK-LU-I. H522. and HI563) were obtained from the 
American Type Culture Collection (Manassas, VA, USA). 
Two norma l - bronch i al ep i thel i al ce ll l i nes were obta in ed f r om 



. Clontech (Palo Alto, CA, USA). Genomic DNA, mRNA, and 
protein were derived from a single harvest of these cells. 

DNA and RNA profiles by microarray analysis

Genomic DNA labeling and hybridization were performed as described previously
(Barrett et al., 2004) with Agilent's Human 1A Oligo Microarray (V2) (Agilent
Technologies, Palo Alto, CA, USA), which contains 22 000 unique 60-mer oligos.
Details of the protocol for analysing transcripts are available at
http://www.chem.agilent.com. Map positions for arrayed genes were assigned by
identifying the DNA sequence represented in the UniGene cluster and matching it
with the Golden Path genome assembly (http://genome.ucsc.edu/; May 7, 2004
Freeze). Microarray images of DNA copy number and expression were analysed by
using Agilent CGH Analytics and Feature Extraction software. DNA copy number
profiles that deviated significantly from background signal ratios (measured
from normal control cell hybridization, as described elsewhere; Barrett et al.,
2004) were interpreted as evidence of true differences in DNA copy number. The
criteria for defining genomic over-representation and amplicons are described
elsewhere (Hyman et al., 2002); details are given in the Supplementary
Information. An increase in mRNA level was defined as a twofold increase in
signal ratio relative to that of the control (log2 > 1).
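
The mRNA criterion above amounts to thresholding a log2 signal ratio. As a minimal sketch, assuming a strict inequality as in the parenthetical and using hypothetical signal values and a hypothetical function name, it could be written as:

```python
import math

def is_increased(sample_signal, control_signal, log2_cutoff=1.0):
    """True when log2(sample/control) exceeds the cutoff (1.0 means twofold)."""
    return math.log2(sample_signal / control_signal) > log2_cutoff

# Hypothetical background-corrected intensities (arbitrary units).
clearly_up = is_increased(850.0, 400.0)   # ratio 2.125
not_up = is_increased(700.0, 400.0)       # ratio 1.75
```

Whether a ratio of exactly two counts as an increase depends on how the boundary is treated; the strict comparison used here is an assumption.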

Quantitative two-dimensional PAGE and mass spectrometry 
Analysis of proteins by two-dimensional PAGE and their 
identification by mass spectrometry were performed as 
previously described (Shen et al., 2004). Briefly, protein pellets 
were solubilized in rehydration buffer, after which the first- 
dimension isoelectric focusing was carried out with a Protean 
IEF Cell (Bio-Rad Laboratories) and the second-dimension 
separation was carried out with Bio-Rad's Ready Gel Precast 
Gels and the Bio-Rad Criterion Cell apparatus. Protein spots 
were visualized by silver-based staining, and all gels were 
assessed with Bio-Rad's PDQuest 2D gel image analysis 
software. Selected spots were subjected to in-gel tryptic 
digestion and analysed on a Voyager-DE PRO matrix-assisted 
laser desorption ionization/time-of-flight mass spectrometer 
(Applied Biosystems, Foster City, CA, USA). The mass list of 
the 20 most intense monoisotopic peaks for each sample was 
entered in the MS-Fit search program (v3.2.1) 
(http://prospector.ucsf.edu/ucsfhtml4.0/msfit.htm) and searched in 
the National Center for Biotechnology Information protein 
database. 
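
The peak-selection step above can be sketched as follows. This is only an illustration, assuming the monoisotopic peaks have already been picked and are available as (m/z, intensity) pairs; the function name and the example values are hypothetical, and the sketch does not reproduce the instrument software or the MS-Fit program itself.

```python
def top_intense_peaks(peaks, n=20):
    """Return the m/z values of the n most intense peaks, in ascending m/z order.

    peaks: iterable of (mz, intensity) tuples for one sample (assumed to be
    monoisotopic peaks that were picked beforehand).
    """
    strongest = sorted(peaks, key=lambda p: p[1], reverse=True)[:n]
    return sorted(mz for mz, _ in strongest)


# Hypothetical peak list, for illustration only.
example = [(1000.5, 10.0), (1200.6, 50.0), (900.4, 30.0)]
print(top_intense_peaks(example, n=2))
```

The resulting mass list is what would be pasted into a peptide-mass-fingerprint search form.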

Southern, Northern, and Western blot analyses 
Southern, Northern, and Western blot hybridizations were 
performed according to standard protocols. cDNA clones for 
the tested genes were purchased from Invitrogen (Carlsbad, 
CA, USA) and prepared as probes for the blot hybridizations. 
Antibodies used were obtained as follows: PRDX1, CALR, 
NFKB1, KCIP-1, and β-actin from Santa Cruz Biotechnology 
(Santa Cruz, CA, USA); and EEF1A2 from Upstate Biotechnology 
(Waltham, MA, USA). 

Fluorescence in situ hybridization and immunohistochemical 
analyses of lung tissue microarrays 

Fluorescence in situ hybridizations and immunohistochemical 
analyses of KCIP-1 and EEF1A2 were carried out as described 
elsewhere (Jiang et al., 2002; Wang et al., 2005) with Lung 
Tissue Microarrays (Ambion, Austin, TX, USA) and 11 
homemade microarray blocks containing tissue samples from 
113 patients with pathologic stage I non-small-cell lung cancer 
(Wang et al., 2005). DNA probes specific for KCIP-1 and 
EEF1A2 were obtained by screening a Human BAC Clone 
library (Invitrogen) by polymerase chain reaction as described 
previously (Jiang et al., 2002). The antibodies used for the 
immunohistochemical analyses were the same as those used 
for the Western blotting. Cell proliferation of the lung tissues 
was assessed with a Ki-67 monoclonal antibody from Santa 
Cruz Biotechnology. Definitions of the cutoff value for a 
positive result of each antibody are shown in Supplementary 
Information. 

siRNA transfection, cellular proliferation assay, and apoptosis 
analysis 

Transfections were carried out by using siPORT Lipid 
Transfection Agent (Ambion) with siRNAs targeting KCIP-1 
or EEF1A2 or with a scrambled siRNA duplex (siControl) 
(Dharmacon Inc., Lafayette, CO, USA), with PBS used as a 
negative control (Jiang et al., 2002). Cells were fixed 24, 48, or 
96 h later and subjected to further tests. All siRNAs were 
prepared by using a transcription-based method with Silencer 
siRNA according to the manufacturer's instructions (Ambion). 
Sequences of the individual siRNAs are listed in 
Supplementary Table 4S. Inhibition of cell growth by the 